Chapter 33 Perl one-liners

Perl one-liners are small Perl programs that fit on a single line and do one thing well. Typical uses include changing line spacing, numbering lines, doing calculations, converting and substituting text, deleting or printing selected lines, parsing logs, editing files in place, computing statistics, carrying out system administration tasks, and updating many files at once.

33.1 Convert a FASTQ file to FASTA

perl -e 'while(<>){$id=$_; $seq=<>;<>;<>; print ">$id$seq";}' ./data/WGBS_example_data/EV1.fastq |head -10
## >@EV1.1 SN603_WB082:5:1101:49.70:115.70 length=51
## TGTTTTTGGGTTAATTAATATTAATTAAATATTTTAATATATTTTTATATA
## >@EV1.2 SN603_WB082:5:1101:30.70:119.80 length=51
## TTGATGGGTATTTTAATTGGTATTTAATTTATTGTTGAGGGTTTTATTATT
## >@EV1.3 SN603_WB082:5:1101:60.60:116.40 length=51
## TTAATGTATGTTTTGTATTTATTGAATAGTTTGGTTTTTATTATTTATTTT
## >@EV1.4 SN603_WB082:5:1101:123.10:108.70 length=51
## GATTATATTATAAAATTATAATTAATGTATGTTTTGTATTTATTGAATAGT
## >@EV1.5 SN603_WB082:5:1101:108.00:120.40 length=51
## TGATTATTTTGAAGTAAAAAGGGGTGATTTAGTTTGTGTTGTTGGTTGGGT

Now how about finding all repeated lines in a file?

perl -ne 'print $_ if $record{$_}++' ./data/WGBS_example_data/EV1.fastq

How about numbering lines? Perl has the special variable $., which holds the current input line number. You can print it together with the line, here only for lines that match gene3:

perl -ne 'print "$.:$_" if /gene3/' ./data/DEG_list.txt
## 4:gene3  -2  0.06

This is equivalent to the grep command shown below:

grep -n 'gene3'  ./data/DEG_list.txt
## 4:gene3  -2  0.06

33.2 Escape single quotes

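A single-quoted one-liner cannot contain a bare single quote, because the shell reads it as the end of the string. Instead, close the quoted string, add an escaped quote, and reopen the string, which is written '\''. This works: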

echo "a,b" | perl -F',' -lane 'print "'\''$F[0]'\''";'
## 'a'

Alternatively, use \047, the octal escape for a single quote. When working with embedded quotes, it is sometimes easier to write \047 than '\''.

echo "a,b" | perl -F',' -lane 'print "\047$F[0]\047";'
## 'a'