Chapter 33 Perl one-liner
Perl one-liners are small Perl programs that fit on a single line and do one thing really well. Typical uses include changing line spacing, numbering lines, doing calculations, converting and substituting text, deleting or printing selected lines, parsing logs, editing files in place, computing statistics, carrying out system administration tasks, and updating many files at once.
33.1 Convert a file from FASTQ to FASTA
perl -e 'while(<>){$id=$_; $seq=<>;<>;<>; print ">$id$seq";}' ./data/WGBS_example_data/EV1.fastq |head -10
## >@EV1.1 SN603_WB082:5:1101:49.70:115.70 length=51
## TGTTTTTGGGTTAATTAATATTAATTAAATATTTTAATATATTTTTATATA
## >@EV1.2 SN603_WB082:5:1101:30.70:119.80 length=51
## TTGATGGGTATTTTAATTGGTATTTAATTTATTGTTGAGGGTTTTATTATT
## >@EV1.3 SN603_WB082:5:1101:60.60:116.40 length=51
## TTAATGTATGTTTTGTATTTATTGAATAGTTTGGTTTTTATTATTTATTTT
## >@EV1.4 SN603_WB082:5:1101:123.10:108.70 length=51
## GATTATATTATAAAATTATAATTAATGTATGTTTTGTATTTATTGAATAGT
## >@EV1.5 SN603_WB082:5:1101:108.00:120.40 length=51
## TGATTATTTTGAAGTAAAAAGGGGTGATTTAGTTTGTGTTGTTGGTTGGGT
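Now how about finding all repeated lines in a file? The hash %record counts how many times each line has been seen, so a line is printed on its second and every later occurrence: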
perl -ne 'print $_ if $record{$_}++' ./data/WGBS_example_data/EV1.fastq
How about numbering lines? Super simple! Perl has the special variable $. that maintains the current line number. You can just print it out together with the line:
perl -ne 'print "$.:$_" if /gene3/' ./data/DEG_list.txt
## 4:gene3 -2 0.06
This is equivalent to the grep command line shown below:
grep -n 'gene3' ./data/DEG_list.txt
## 4:gene3 -2 0.06
33.2 Escape single quote
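You can’t put a bare single quote inside a one-liner that is itself wrapped in single quotes, because the shell would treat it as the end of the quoted string. You need to escape it as '\'' which closes the quoted string, adds an escaped single quote, and reopens the string. This works: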
echo "a,b" | perl -F',' -lane 'print "'\''$F[0]'\''";'
## 'a'
Alternatively, use \047, the octal escape for a single quote, which Perl interprets inside a double-quoted string. When working with embedded quotes, it is sometimes easier to just write \047 rather than something like '\''.
echo "a,b" | perl -F',' -lane 'print "\047$F[0]\047";'
## 'a'