Aug 04, 2009 12:03 AM EDT

Counting occurrences of a string with grep

I noticed that one of my servers was using quite a bit more CPU than normal, yet my analytics software wasn’t showing a spike in traffic. I have a rather large Apache access_log file, and I wanted to see how many times a particular bot had scraped my web pages. Looking through it by hand isn’t practical, since the log is over 1 GB in size.

Instead, what I did was this simple grep command:

grep -c "myregex" access_log

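Inside the quotes, I put the actual string I was searching for. The -c flag stands for “count”: instead of printing the matching lines, grep prints how many lines match the regular expression. Each request takes up one line in an Apache access log, so that number is also the number of matching requests. Note that a line containing the pattern twice is still counted only once.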
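To make the difference between counting lines and counting individual matches concrete, here is a small sketch. The sample log contents and the pattern "ExampleBot" are made up for illustration, and the -o option (print each match on its own line) is assumed to be available, as it is in GNU and BSD grep.

```shell
# Hypothetical sample data; "ExampleBot" is an invented pattern.
log=$(mktemp)
printf '%s\n' \
  'GET /a ExampleBot ExampleBot' \
  'GET /b Mozilla' \
  'GET /c ExampleBot' > "$log"

# -c counts the lines that contain at least one match.
line_count=$(grep -c "ExampleBot" "$log")

# -o prints every match on its own line, so wc -l counts individual matches.
# tr strips the padding that some wc implementations add.
match_count=$(grep -o "ExampleBot" "$log" | wc -l | tr -d ' ')

echo "matching lines: $line_count, total matches: $match_count"
rm -f "$log"
```

For an access log, where one line is one request, the -c form is usually the number you want.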

In this case, the scraping program that I thought was the culprit had downloaded fewer than 100 web pages, but the true culprit had downloaded many more. It was sending a browser’s User-Agent string, so it’s either a really active visitor, a browser plugin, or a spider spoofing a real browser. To resolve this, I used iptables to block its IP address. Problem solved.
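One way to find the busiest client when the User-Agent can’t be trusted is to count requests per IP address. This is only a sketch, not necessarily what I ran: it assumes the log uses Apache’s common/combined format, where the client address is the first field, and the sample addresses below are invented documentation addresses.

```shell
# Hypothetical sample log in Apache common log format.
log=$(mktemp)
printf '%s\n' \
  '203.0.113.5 - - [04/Aug/2009:00:00:01 -0400] "GET /a HTTP/1.1" 200 512' \
  '198.51.100.7 - - [04/Aug/2009:00:00:02 -0400] "GET /b HTTP/1.1" 200 512' \
  '203.0.113.5 - - [04/Aug/2009:00:00:03 -0400] "GET /c HTTP/1.1" 200 512' > "$log"

# Take the first field (client IP), count how often each one appears,
# sort by that count in descending order, and keep the top entry.
top_ip=$(awk '{print $1}' "$log" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "busiest client: $top_ip"

# Blocking that address with iptables needs root, so it is left commented out:
# iptables -A INPUT -s "$top_ip" -j DROP

rm -f "$log"
```

On a real server, replace the temporary file with the path to access_log, and check who owns the address before dropping its traffic.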

Darren grep | programming | strings | counting