Python Grep

Posted : admin On 1/26/2022

Recreating grep in Python



How to make Python CLI tools.

Tags: python, tutorial

Let's make our own version of grep, nicknamed dumbgrep. Along the way, we'll learn about 19th-century Russian literature and how to make command line interface (CLI) tools in Python.

Why Python? Python's argparse package makes it easy to handle the parsing side of things. And using the Python Package Index (PyPI), you can easily deliver a CLI tool to the writhing masses of humanity.

Baby steps 👶

You'll need Python 3 if you want to follow along. For reference, the full program is available here. Shamefully, I've only tested it on Linux, so there might be extra hoop-jumping required to set it up on Windows.




Here's the skeleton of our project.

  • dumbgrep is the Python script that a user can call from the CLI.
  • setup.py contains the information we need to package our project and upload it to PyPI for distribution.
  • dumbgrepcli/__init__.py contains the actual functionality of dumbgrep. If you didn't know, a folder that contains a file called __init__.py is a Python package. That package's code is saved in the __init__.py file.

Why not dump all of our Python code into the dumbgrep file? This more complicated structure allows us to split the code into multiple files and even multiple subpackages, which will be useful if the codebase grows too big. It's also easier to add tests this way, if you're boring like that.

Let's write the dumbgrep script. All it does is call the main() function of the dumbgrepcli package, which we'll write later.

The only thing here that might be unusual to a Python aficionado is the so-called shebang line at the start, which tells Unix-like systems that the script should be run using Python 3.

Next, here's what we might write in setup.py. This determines how to build the package and how to upload it to PyPI.

Most of the fields are self-explanatory. 'name' is the package's name on PyPI, which must be unique. The files under the 'scripts' field will be installed to a place where the user can call them from the command line.

As an aside: if you add a Markdown-formatted README to your project, then a useful trick is to reuse it as the long description of your package on PyPI.
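To make the fields above concrete, here's a rough sketch of the keyword arguments a setup.py for this project might pass to setuptools. The helper name build_setup_kwargs, the version number and the description text are placeholders invented for illustration; 'name', 'scripts' and the package name come from the article, and the long description is simply whatever README text you hand in.

```python
def build_setup_kwargs(readme_text):
    """Return keyword arguments for setuptools.setup() (illustrative sketch).

    build_setup_kwargs is a hypothetical helper, not part of setuptools.
    """
    return {
        "name": "dumbgrep-cli",          # must be unique on PyPI
        "version": "0.0.1",              # placeholder value
        "description": "A dumb version of grep.",  # placeholder text
        # Reuse the Markdown README as the long description shown on PyPI.
        "long_description": readme_text,
        "long_description_content_type": "text/markdown",
        "packages": ["dumbgrepcli"],     # the folder containing __init__.py
        "scripts": ["dumbgrep"],         # installed where the user can call it
    }


# A real setup.py would typically finish with something like:
#     from setuptools import setup
#     with open("README.md", encoding="utf-8") as f:
#         setup(**build_setup_kwargs(f.read()))
```

Keeping the arguments in a plain dictionary is just a convenience for this sketch; most setup.py files pass them to setup() directly.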

That's the boring stuff out of the way! Now we can move on to plagiarising grep.

G(lobally search for a) R(egular) E(xpression and) P(rint matching lines)

Here's how we start our implementation of grep in dumbgrepcli/__init__.py.

We import Python's argparse module, which we'll use for argument-parsing. We define the long-awaited main() function. There's boilerplate code at the bottom that calls main() when we execute the file directly, just so we can test it. Within main(), we create an ArgumentParser, add a string argument called 'pattern' to it, parse the command line arguments, and finally, print out the value of the 'pattern' argument.
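A minimal sketch of that first version might look like the following. The optional argv parameter and the return value are additions made here so the function can be called from other code or a test; the prog name and help strings are also assumptions.

```python
import argparse


def main(argv=None):
    # When argv is None, argparse reads the arguments from sys.argv[1:].
    parser = argparse.ArgumentParser(
        prog="dumbgrep", description="A dumb version of grep."
    )
    # A required positional string argument called 'pattern'.
    parser.add_argument("pattern", type=str, help="the pattern to search for")
    args = parser.parse_args(argv)
    print(args.pattern)
    return args


# The boilerplate at the bottom of the file would be:
#     if __name__ == "__main__":
#         main()
```

Calling main(["Karenina"]) prints Karenina, and calling it with an empty argument list makes argparse print a usage message and exit.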

This already gets us a lot of stuff. We have nicely-formatted help, by default.

If a user forgets to provide a pattern, they get a nice error message.

And we can access the value of the 'pattern' argument through args.pattern.

All that remains is to code up the logic of grep. This is rather easy in Python, since it has a built-in regex module, re.

We create a Pattern object based on the pattern provided by the user, and all lines of input that match this pattern are printed to standard output.
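In code, that logic could be sketched roughly as below. The helper name matching_lines and the stream parameter are hypothetical, introduced so the matching can be exercised without a real standard input; reading from sys.stdin by default is also an assumption about where the lines come from.

```python
import argparse
import re
import sys


def matching_lines(pattern_text, lines):
    """Yield each line that contains a match for the regular expression."""
    pattern = re.compile(pattern_text)  # a re.Pattern object
    for line in lines:
        # search() looks for a match anywhere in the line, as grep does.
        if pattern.search(line):
            yield line


def main(argv=None, stream=None):
    parser = argparse.ArgumentParser(prog="dumbgrep")
    parser.add_argument("pattern", type=str, help="the pattern to search for")
    args = parser.parse_args(argv)
    lines = sys.stdin if stream is None else stream
    for line in matching_lines(args.pattern, lines):
        # Lines read from a stream keep their trailing newline.
        print(line, end="")
```

Note the use of search() rather than match(): match() only succeeds at the start of a line, which is not how grep behaves.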


And that's it! We've recreated grep. Let's set up a virtual environment where we can install this bad boy and test it out. (A virtual environment is a self-contained Python installation that you can experiment on without mucking up your main Python installation.)

Create and activate the environment
Install dumbgrep
Test it out, then deactivate the environment

In the next section we'll explore argparse a bit more by adding some bells and whistles to dumbgrep.


Milk and sugar

Let's say we want to recreate grep's '-v' flag, which means that only lines NOT matching the input pattern are printed. All we have to do is add a boolean flag to our argument parser to check whether we should invert the matches. And then tweak the matching logic to use that flag.
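One way this could look, as a sketch: the long option name --invert-match mirrors GNU grep, while build_parser and select_lines are hypothetical helper names used here for illustration.

```python
import argparse
import re


def build_parser():
    parser = argparse.ArgumentParser(prog="dumbgrep")
    parser.add_argument("pattern", type=str, help="the pattern to search for")
    # store_true makes this a boolean flag: False unless -v is given.
    parser.add_argument(
        "-v", "--invert-match", action="store_true",
        help="print only the lines that do NOT match the pattern",
    )
    return parser


def select_lines(pattern_text, lines, invert=False):
    """Yield matching lines, or the non-matching ones when invert is True."""
    pattern = re.compile(pattern_text)
    for line in lines:
        found = pattern.search(line) is not None
        # Keep the line when exactly one of (found, invert) is true.
        if found != invert:
            yield line
```

The parsed flag is available as args.invert_match, so main() would pass invert=args.invert_match to the matching code.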


How about the '--max-count' parameter, which limits the number of lines that grep prints out? We accept the limit as an integer argument, and count the number of matched lines so that we can exit early once the limit has been reached.
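A possible sketch of the counting logic follows. The short form -m matches GNU grep; build_parser and select_lines are hypothetical helper names, and treating a missing limit as "no limit" is an assumption.

```python
import argparse
import re


def build_parser():
    parser = argparse.ArgumentParser(prog="dumbgrep")
    parser.add_argument("pattern", type=str, help="the pattern to search for")
    # type=int makes argparse convert the value and reject non-integers.
    parser.add_argument(
        "-m", "--max-count", type=int, default=None,
        help="stop after this many matching lines",
    )
    return parser


def select_lines(pattern_text, lines, max_count=None):
    """Yield matching lines, stopping once max_count have been yielded."""
    pattern = re.compile(pattern_text)
    count = 0
    for line in lines:
        # Exit early once the limit has been reached (None means no limit).
        if max_count is not None and count >= max_count:
            break
        if pattern.search(line):
            count += 1
            yield line
```

argparse turns the hyphen in --max-count into an underscore, so the value is read as args.max_count.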

It works!

Okay, okay. That's enough of that. There's one last trick I'd like to share before we finish, however: colour highlighting in the terminal. If we want to highlight the matching part of a line, then we can use escape codes to modify font colour in the terminal. First we store the Match object returned by the regex search in its own variable, since we'll need it later to isolate the part of the line that matches the pattern. And we call a new function, highlight(), to format the output.

Here's the highlight() function. Main things to note: 1) to avoid having ugly escape codes in our output when we write to a file, we check whether we're writing to a terminal through sys.stdout.isatty(); 2) the first escape code we write changes the colour of all following text to red, and it's only after we write the reset escape code that this effect is undone.
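A sketch of what such a function might look like is below. The constant names, the use_colour parameter (there so the terminal check can be overridden) and the format_line helper are assumptions made for illustration; the escape codes are the standard ANSI ones for red text and reset.

```python
import re
import sys

RED = "\033[31m"    # ANSI escape code: switch the text colour to red
RESET = "\033[0m"   # ANSI escape code: undo all colour/style changes


def highlight(line, match, use_colour=None):
    """Return line with the matched span wrapped in red escape codes."""
    if use_colour is None:
        # Only colour the output when writing to a terminal, so that
        # redirecting to a file doesn't fill it with escape codes.
        use_colour = sys.stdout.isatty()
    if not use_colour:
        return line
    start, end = match.span()
    # Everything after RED is red until RESET is written.
    return line[:start] + RED + line[start:end] + RESET + line[end:]


def format_line(pattern_text, line, use_colour=None):
    """Hypothetical helper: search the line and highlight the first match."""
    match = re.search(pattern_text, line)
    if match is None:
        return None
    return highlight(line, match, use_colour)
```

Only the first match in each line is highlighted in this sketch; colouring every match would mean looping over re.finditer() instead.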

And the result:

Distribute to the clamouring public

If we're feeling particularly benevolent and charitable, then we can upload our nifty tool to PyPI. After all, why would anyone want to use the original grep when they could use our version?


Oh, right...

Anyway, here's an excellent guide that describes the whole process: There's no point in duplicating the instructions here, since the guide is thorough and straightforward. Once dumbgrep is on PyPI, anyone can download it by running pip3 install dumbgrep-cli, as per the package name we defined in setup.py.


That's it. The full dumbgrep code is available here. You can use it as a template for your own CLI tools. I've also created 2 actually kinda useful CLI tools that you can check out for inspiration: pseu and bs.

I'd be happy to hear from you at [email protected].