Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Unix: Find All Text Below String in a File

DZone's Guide to

Unix: Find All Text Below String in a File

Check out how author Mark Needham tackles the issue of parsing out text from the end of a file that occurs after a given string.

· Java Zone ·
Free Resource

The CMS developers love. Open Source, API-first and Enterprise-grade. Try BloomReach CMS for free.

I recently wanted to parse some text out of a bunch of files so that I could do some sentiment analysis on it. Luckily the text I want is at the end of the file and doesn’t have anything after it, but there is text before it that I want to get rid of.

The files look like this:

# text I don't care about

= Heading of the bit I care about

# text I care about

In other words I want to find the line that contains the Heading and then get all the text after that point.

I figured sed was the tool for the job, but my knowledge of the syntax was a bit rusty. Luckily this post served as a refresher.

Effectively what we want to do is delete from the beginning of the file up until the line after the heading. We can do this with the following command:

$ cat /tmp/foo.txt 
# text I don't care about

= Heading of the bit I care about

# text I care about
$ cat /tmp/foo.txt | sed '1,/Heading of the bit I care about/d'

# text I care about

That still leaves an extra empty line after the heading which is a bit annoying but easy enough to get rid of by passing another command to sed that strips empty lines:

$ cat /tmp/foo.txt | sed -e '1,/Heading of the bit I care about/d' -e '/^\s*$/d'
# text I care about

The only difference here is that we’re now passing the ‘-e’ flag to allow us to specify multiple commands. If we just pass them sequentially then the 2nd one will be interpreted as the name of a file.

BloomReach CMS: the API-first CMS of the future. Open-source & enterprise-grade. - As a Java developer, you will feel at home using Maven builds and your favorite IDE (e.g. Eclipse or IntelliJ) and continuous integration server (e.g. Jenkins). Manage your Java objects using Spring Framework, write your templates in JSP or Freemarker. Try for free.

Topics:
unix ,search ,file ,shell ,sed

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}