delete line from text file



Waldo2k2
03-29-2005, 12:02 AM
I'm working on some basic log processing on a web server running Unix. I'm trying to keep track of a few things (hack attempts, which IP addresses connect and how often, how much they download, etc.). After sorting everything, I want to delete lines containing a certain IP address and/or the - character, but I just cannot figure out what command to use. Here's what I have (assuming all log files are named access.log.x, where x is a number):


cut -d ' ' -f1 access*>tf2
cut -d ' ' -f10 access*>tf1
echo "Bandwidth usage by file size:">usage.rpt
paste -d ' ' tf1 tf2 | sort -n -r >>usage.rpt
^command to delete needed here

Basically, the first field in the Apache log files is the IP address, and the tenth field is either the size downloaded in bytes or, in the case of a 404 error, a - character. After pasting and sorting by file size in descending order, I need to chop out those lines. I have everything working up to deleting those lines, at which point I'm lost. Any suggestions? Thank you very much.

Hammer
03-29-2005, 01:28 AM
grep -v text_i_dont_want
like so:


# IP is field 1, byte count is field 2 after the cut; sort on the byte count, largest first
cut -d ' ' -f1,10 access.log.* | grep -v -e ' -$' | sort -k2,2nr
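
The original question also asked about dropping a particular IP address. Several -e patterns can be handed to a single grep -v, and a line is removed if it matches any of them. A minimal sketch, assuming the two-column "IP bytes" layout that the cut above produces; the addresses and the file name sample.txt are made up for illustration:

```shell
# Hypothetical sample in "IP bytes" layout; the addresses are invented.
workdir=$(mktemp -d)
cat > "$workdir/sample.txt" <<'EOF'
10.0.0.1 5120
10.0.0.2 -
10.0.0.3 20480
10.0.0.1 -
10.0.0.4 1024
EOF

# -v prints only the lines that match none of the patterns.
# '^10\.0\.0\.3 ' : lines for one IP (dots escaped, anchored so that
#                   10.0.0.30 or 110.0.0.3 are not caught by accident).
# ' -$'           : lines whose byte field is just a dash.
filtered=$(grep -v -e '^10\.0\.0\.3 ' -e ' -$' "$workdir/sample.txt" | sort -k2,2nr)
printf '%s\n' "$filtered"

rm -r "$workdir"
```

Anchoring the patterns matters once host names or larger addresses show up in the log, since a bare "-" or an unescaped IP can match more lines than intended.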

Waldo2k2
03-29-2005, 09:03 AM
Thanks a bunch. I've used grep for some other things, but I guess I missed the -v in the man pages. Thanks again, I've really been beating my head against the wall on this one.

Hammer
03-29-2005, 05:05 PM
If you're into processing text files like this, you'll probably want to invest some time in awk.

Here's a simple example. There may well be better ways to do what I've done; I am no awk expert!




>cat awk.cmds

BEGIN {
    print "My report..."
}

#
# No test here, this action happens for every line
#
{
    iplist[$1] += $10;
}

#
# Numeric check on field 10,
# Record bytes uploaded
#
($10+0 == $10) {
    TotalBytes += $10;
    printf ("IP: %-16s Bytes: %-10d\n", $1, $10);
}

#
# Test for -, meaning no bytes uploaded
#
$10 == "-" {
    FailedRequests++;
}

#
# This happens at the end of the run
#
END {
    printf ("Total bytes processed: %d\n", TotalBytes);
    printf ("Total failed requests: %d\n", FailedRequests);
    printf ("\nIP Summary List Follows:\n");

    for ( i in iplist )
    {
        printf ("IP: %-16s TotalBytes: %-10d\n", i, iplist[i]);
    }
}

>awk -f awk.cmds access.log.*
....
.... output here... run it and see....
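
The same kind of pattern test can also do the original filtering job in one pass. A rough sketch, assuming the space-separated log layout described earlier in the thread (IP in field 1, byte count or - in field 10); the log lines, the awk variable name skip, and the excluded address 10.0.0.3 are invented for illustration:

```shell
# Hypothetical log file; the entries are made up but follow the
# field positions described in the thread.
workdir=$(mktemp -d)
cat > "$workdir/access.log.1" <<'EOF'
10.0.0.1 - - [29/Mar/2005:00:02:11 -0500] "GET /a.zip HTTP/1.1" 200 5120
10.0.0.2 - - [29/Mar/2005:00:02:15 -0500] "GET /missing HTTP/1.1" 404 -
10.0.0.3 - - [29/Mar/2005:00:03:01 -0500] "GET /b.zip HTTP/1.1" 200 20480
10.0.0.4 - - [29/Mar/2005:00:04:42 -0500] "GET /c.txt HTTP/1.1" 200 1024
EOF

# Print "bytes IP" only for lines whose byte field is not "-" and whose
# IP is not the one to drop, then sort by bytes, largest first.
report=$(awk -v skip='10.0.0.3' '$10 != "-" && $1 != skip { print $10, $1 }' \
    "$workdir"/access.log.* | sort -nr)
printf '%s\n' "$report"

rm -r "$workdir"
```

Filtering before the sort means no separate delete step is needed afterwards.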

Waldo2k2
03-29-2005, 08:48 PM
I'm definitely going to use awk next time, that way I'll have some more control over all this crap. Thanks again.

Perspective
03-30-2005, 01:01 PM
>>If you're into processing text files like this, you'll probably want to invest some time in awk.

and sed... very handy for quick in-line content replacing among other things.



sed --in-place 's/old-text/new-text/' <file>
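
sed can also delete lines outright, which is what the thread title asks about. A small sketch, assuming a two-column "bytes IP" report like the one built in the first post; the addresses and the output name usage.clean are made up. It writes to a new file rather than editing in place, since --in-place is, as far as I know, a GNU sed option and may not exist elsewhere:

```shell
# Hypothetical report in "bytes IP" layout; the addresses are invented.
workdir=$(mktemp -d)
cat > "$workdir/usage.rpt" <<'EOF'
20480 10.0.0.3
5120 10.0.0.1
1024 10.0.0.4
- 10.0.0.2
EOF

# /regex/d deletes every line matching the regex; all other lines are printed.
# '/^- /d'            : lines whose first field is just a dash.
# '/ 10\.0\.0\.3$/d'  : lines for one IP (dots escaped so they match literally).
sed -e '/^- /d' -e '/ 10\.0\.0\.3$/d' "$workdir/usage.rpt" > "$workdir/usage.clean"
cleaned=$(cat "$workdir/usage.clean")
printf '%s\n' "$cleaned"

rm -r "$workdir"
```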

dwks
05-07-2005, 06:03 PM
I don't know about awk, but Perl is very handy for that kind of thing.