Aug 03, 2009 06:03 PM

Counting occurrences of a string with Grep

I noticed that one of my servers was using quite a bit more of its CPU resources than normal, yet my Analytics software wasn’t showing a spike in traffic. I have a rather large Apache access_log file, and I wanted to see how many times a particular bot scraped my web pages. Looking through it by hand isn’t practical since the log is over 1GB in size.

Instead, I ran this simple grep command:

grep -c "myregex" access_log

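In the quotes, I put the real string that I was searching for. The -c flag stands for “count”: it prints the number of lines in the file that match the regular expression. Note that it counts matching lines, not individual matches, so a line containing the string twice is still counted once. Since each request takes up one line in an Apache access log, the result is effectively the number of requests that matched.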
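To make the difference between counting lines and counting matches concrete, here is a minimal sketch. The sample log lines and the ExampleBot name are made up for illustration and are not from my real log. It also uses the -o flag, which is available in GNU and BSD grep but is not part of POSIX.

```shell
# Build a tiny sample log (hypothetical data, for illustration only).
sample_log=$(mktemp)
printf '%s\n' \
  'GET /a "ExampleBot/1.0 ExampleBot"' \
  'GET /b "Mozilla/5.0"' \
  'GET /c "ExampleBot/1.0"' > "$sample_log"

# -c counts matching lines: the first and third lines match.
line_count=$(grep -c 'ExampleBot' "$sample_log")

# -o prints every match on its own line, so wc -l counts each occurrence.
# tr strips the padding that some wc implementations add.
match_count=$(grep -o 'ExampleBot' "$sample_log" | wc -l | tr -d ' ')

echo "matching lines: $line_count, total occurrences: $match_count"
rm -f "$sample_log"
```

For an access log where the string appears at most once per request, both numbers will be the same, so plain -c is usually enough.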

In this case, the scraping program that I thought was the culprit had downloaded fewer than 100 web pages, but the true culprit had downloaded many more. It was using a browser’s User Agent, so it’s either a really active visitor, a browser plugin, or a spider spoofing a real browser. To resolve this, I used iptables to block its IP address. Problem solved.
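One way to find the busiest client is to tally requests per IP address. This is only a sketch under assumptions: it assumes the log uses Apache's common or combined format, where the client address is the first field, and the addresses and log lines below are invented documentation examples. The iptables line is left commented out because it needs root and changes live firewall rules.

```shell
# Build a small sample log (hypothetical entries, for illustration only).
sample_log=$(mktemp)
printf '%s\n' \
  '203.0.113.7 - - [03/Aug/2009:17:00:01 -0400] "GET /a HTTP/1.1" 200 512' \
  '198.51.100.4 - - [03/Aug/2009:17:00:02 -0400] "GET /b HTTP/1.1" 200 512' \
  '203.0.113.7 - - [03/Aug/2009:17:00:03 -0400] "GET /c HTTP/1.1" 200 512' \
  '203.0.113.7 - - [03/Aug/2009:17:00:04 -0400] "GET /d HTTP/1.1" 200 512' \
  > "$sample_log"

# Take the first field (client IP), count how often each one appears,
# sort by count in descending order, and keep the address at the top.
top_ip=$(awk '{print $1}' "$sample_log" | sort | uniq -c | sort -rn \
  | head -n 1 | awk '{print $2}')

echo "busiest client: $top_ip"

# Dropping all traffic from that address (run as root, only after
# confirming the address really is the offender):
# iptables -A INPUT -s "$top_ip" -j DROP

rm -f "$sample_log"
```

Against a real log, replace "$sample_log" with the path to access_log and look at the top few entries, not just the first, before blocking anything.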

darren

grep | programming | strings | counting

