Print every block that starts at a line matching $record_start_regex and contains a match for $pattern.
sed -n "/$record_start_regex/{x;/$pattern/Ip;d}; \${x;G;/$pattern/Ip}; {H}"
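A minimal sketch of the command above on sample input, assuming GNU sed (the `I` flag is a GNU extension). The helper name print_matching_blocks and the sample records are made up for illustration.

```shell
# Hypothetical helper: print every record that contains a pattern.
# $1 = regex marking the first line of a record, $2 = regex to look for.
print_matching_blocks() {
  record_start_regex=$1
  pattern=$2
  # Each record is collected in the hold space; it is tested and printed
  # when the next record starts, or when the last input line is reached.
  sed -n "/$record_start_regex/{x;/$pattern/Ip;d}; \${x;G;/$pattern/Ip}; {H}"
}

# Three sample records; only the second one contains "needle".
printf '%s\n' 'BEGIN one' 'alpha' 'BEGIN two' 'beta needle' 'BEGIN three' 'gamma' |
  print_matching_blocks '^BEGIN' 'needle'
# prints:
# BEGIN two
# beta needle
```

Both regexes are spliced into the sed script unescaped, so a `/` inside either one would need a backslash.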
Print only the lines between Pat1 and Pat2, excluding both marker lines (awk):
awk '/Pat1/{g=1; next} /Pat2/{g=0} g'
Within the range of lines from $start_pattern to $end_pattern, insert $new_content, followed by a newline, before each match of $pattern.
sed "/$start_pattern/,/$end_pattern/s/$pattern/$new_content\n&/g"
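A small worked example of the range-limited insert, assuming GNU sed (which accepts `\n` in the replacement). The helper name insert_before_in_range and the sample data are illustrative only.

```shell
# Hypothetical helper. All four arguments are spliced into the sed script,
# so they must not contain an unescaped "/" and new_content must not
# contain an unescaped "&".
insert_before_in_range() {
  start_pattern=$1
  end_pattern=$2
  pattern=$3
  new_content=$4
  sed "/$start_pattern/,/$end_pattern/s/$pattern/$new_content\n&/g"
}

# Only the "port=" line between BEGIN and END gets a comment line before it.
printf '%s\n' 'port=1' 'BEGIN' 'port=2' 'END' 'port=3' |
  insert_before_in_range '^BEGIN' '^END' '^port=' '# managed'
# prints:
# port=1
# BEGIN
# # managed
# port=2
# END
# port=3
```

If $pattern matches in the middle of a line, the line is split at that point, because the newline is inserted directly before the match.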
Convert simple CSV data (no quoted commas) to a Markdown table
sed 's/,/ | /g; s/^.*$/| & |/; 1{p; s/[^|]\+/ :---: /g;}'
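A quick check on sample input, assuming GNU sed (for `\+`) and CSV without quoted or embedded commas. This sketch formats every row first, then prints the header and derives the alignment row from it; the function name csv_to_markdown is made up for illustration.

```shell
# Hypothetical helper: turn simple CSV on stdin into a Markdown table.
csv_to_markdown() {
  # 1. turn commas into column separators and wrap the row in pipes
  # 2. on line 1, print the header, then rewrite each cell as :---:
  sed 's/,/ | /g; s/^.*$/| & |/; 1{p; s/[^|]\+/ :---: /g;}'
}

printf '%s\n' 'name,qty' 'apple,3' 'pear,5' | csv_to_markdown
# prints:
# | name | qty |
# | :---: | :---: |
# | apple | 3 |
# | pear | 5 |
```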
Print only the lines between the two patterns, excluding the pattern lines themselves. To delete those lines instead, drop -n and replace 'p' with 'd'.
sed -n '/Pat1/,/Pat2/{//!p}'
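For illustration, here are both variants on a six-line sample, assuming GNU sed. The empty regex `//` reuses the most recently matched regex, which is how the two boundary lines are excluded; the function name sample is just a stand-in for real input.

```shell
# Hypothetical sample input: two marker lines with two lines between them.
sample() { printf '%s\n' 'a' 'Pat1' 'b' 'c' 'Pat2' 'd'; }

# Print only the inner lines.
sample | sed -n '/Pat1/,/Pat2/{//!p}'
# prints:
# b
# c

# Delete only the inner lines, keeping the markers and everything outside.
sample | sed '/Pat1/,/Pat2/{//!d}'
# prints:
# a
# Pat1
# Pat2
# d
```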
Extract only the From and Subject headers of an email:
sed -rn '1,/^$/{/^(from|subject):/Ip}'
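A short demonstration, assuming GNU sed and a message whose headers end at the first blank line. The function name extract_headers and the sample message are invented for the example.

```shell
# Hypothetical helper: read a raw message on stdin, print From/Subject headers.
extract_headers() {
  # The range 1,/^$/ limits the search to the header block.
  sed -rn '1,/^$/{/^(from|subject):/Ip}'
}

printf '%s\n' \
  'From: alice@example.com' \
  'To: bob@example.com' \
  'Subject: Hello' \
  '' \
  'Subject: this line is in the body' |
  extract_headers
# prints:
# From: alice@example.com
# Subject: Hello
```

The body line that happens to start with "Subject:" is skipped because it comes after the blank line that ends the range.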
Convert an HTML table to CSV:
# Strips out just the table from the html document
# Removes any html inner comments
# Converts to CSV, assuming first row to be header row
# Wraps each field in double quotes
# Removes double quotes in header line (in the sed '1 s/"//g' statement)
# Removes any formatting markup from header line
sed -re "/<table/,/<\/table/!d" \
-e "s/.*(<table)/\1/; s/(<\/table>).*/\1/;" \
-e 's/<!--.*-->//g; s/^[[:space:]]*//g; s/[[:space:]]*$//g' |\
tr -d '\n' |\
sed -re 's/<\/(TR|THEAD|TBODY)[^>]*>/\n/Ig' \
-e 's/<\/?(TABLE|TR|THEAD|TBODY)[^>]*>//Ig' \
-e "s/\"/'/g" |\
sed -re 's/^<T[DH][^>]*>|<\/?T[DH][^>]*>$/"/Ig' \
-e 's/[[:space:]]*<\/T[DH][^>]*><T[DH][^>]*>/","/Ig' \
-e '/^[[:space:]]*$/d' \
-e '1s/"//g; 1s/<\/?[^>]*>//g'
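As a rough check, the pipeline above can be wrapped in a function and fed a tiny document. This is a sketch assuming GNU sed and a single, well-formed table; the function name html_table_to_csv and the sample HTML are hypothetical.

```shell
# Hypothetical wrapper around the pipeline: HTML on stdin, CSV on stdout.
html_table_to_csv() {
  # Stage 1: keep only the table, drop comments and surrounding whitespace.
  sed -re "/<table/,/<\/table/!d" \
      -e "s/.*(<table)/\1/; s/(<\/table>).*/\1/;" \
      -e 's/<!--.*-->//g; s/^[[:space:]]*//g; s/[[:space:]]*$//g' |
  tr -d '\n' |
  # Stage 2: one output line per table row; swap " for ' inside the data.
  sed -re 's/<\/(TR|THEAD|TBODY)[^>]*>/\n/Ig' \
      -e 's/<\/?(TABLE|TR|THEAD|TBODY)[^>]*>//Ig' \
      -e "s/\"/'/g" |
  # Stage 3: turn cell tags into quoted, comma-separated fields.
  sed -re 's/^<T[DH][^>]*>|<\/?T[DH][^>]*>$/"/Ig' \
      -e 's/[[:space:]]*<\/T[DH][^>]*><T[DH][^>]*>/","/Ig' \
      -e '/^[[:space:]]*$/d' \
      -e '1s/"//g; 1s/<\/?[^>]*>//g'
}

printf '%s\n' \
  '<html><body>' \
  '<table>' \
  '<tr><th>Name</th><th>Qty</th></tr>' \
  '<tr><td>apple</td><td>3</td></tr>' \
  '</table>' \
  '</body></html>' |
  html_table_to_csv
# prints:
# Name,Qty
# "apple","3"
```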
Extract a VCARD record matching pattern:
sed -rne "/BEGIN:VCARD/{x;/$pattern/I{s/END:VCARD/\0\n/;p};d};" \
-e "\${x;G;/$pattern/Ip}; {H}"
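A minimal sketch of the same command on two sample cards, assuming GNU sed (for the `I` flag and `\0` in the replacement). The function name extract_vcards and the card contents are made up.

```shell
# Hypothetical helper: print the vCards on stdin that match a pattern.
extract_vcards() {
  pattern=$1
  sed -rne "/BEGIN:VCARD/{x;/$pattern/I{s/END:VCARD/\0\n/;p};d};" \
      -e "\${x;G;/$pattern/Ip}; {H}"
}

# Two cards; the match is case-insensitive, so "alice" finds "FN:Alice".
printf '%s\n' \
  'BEGIN:VCARD' 'FN:Alice' 'END:VCARD' \
  'BEGIN:VCARD' 'FN:Bob' 'END:VCARD' |
  extract_vcards 'alice'
# prints the first card, followed by a blank separator line
```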
Rearrange CSV columns:
awk 'BEGIN{FS=","; OFS=","}{print $4,$3,$2,$1}'
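For example, on a four-column row the command reverses the column order. This sketch assumes plain CSV without quoted fields, since awk splits on every comma; the function name reverse_four_columns is illustrative.

```shell
# Hypothetical helper: emit columns 4,3,2,1 of simple comma-separated input.
reverse_four_columns() {
  awk 'BEGIN{FS=","; OFS=","}{print $4,$3,$2,$1}'
}

printf '%s\n' 'a,b,c,d' '1,2,3,4' | reverse_four_columns
# prints:
# d,c,b,a
# 4,3,2,1
```

Changing the list after `print` selects or reorders any other set of columns.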
Most frequent words, counting only words of at least a minimum length:
# Replace the bracketed parameters with your preferred values
tr -cs A-Za-z '\012' | tr A-Z a-z | grep -E "\<.{<min_chars>,}\>" | sort | uniq -c | sort -nr | head -n <top_results>
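One way to fill in the two parameters is a small function, sketched here with grep -E and head -n and assuming GNU grep (for `\<` and `\>`). The function name top_words and the sample sentence are hypothetical.

```shell
# Hypothetical helper: $1 = minimum word length, $2 = number of results.
top_words() {
  min_chars=$1
  top_results=$2
  # One word per line, lowercased, filtered by length, then counted.
  tr -cs A-Za-z '\012' | tr A-Z a-z | grep -E "\<.{$min_chars,}\>" |
    sort | uniq -c | sort -nr | head -n "$top_results"
}

# Words of 4+ letters: "quick" appears 3 times, "brown" twice, "bear" once.
printf '%s\n' 'The quick brown fox. The quick dog; the QUICK cat, brown bear' |
  top_words 4 2
# prints "3 quick" then "2 brown" (uniq -c pads the counts with spaces)
```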