Wikipedia defines open source intelligence (OSINT) as a form of intelligence collection management that involves finding, selecting, and acquiring information from publicly available sources and analyzing it to produce actionable intelligence.
In reality, the methodology used in OSINT is the information gathering phase of every penetration test. Someone just stuck a fancy name on the process.
Regardless of the name, OSINT is very useful, and its results can be put to good use even outside of the penetration testing process.
The information gathering, or OSINT, process can be summarized in the following steps:
- Identify your point of interest - who or what is the target of your investigation. Start broad, and then narrow down to the interesting elements. For instance, start with a domain name or a provider's IP address pool and work down until you find the contacts and names of actual persons. Then you can start drilling for material they have left on the Internet for further useful clues.
- Collect information from multiple sources - consult search engines, corporate sites, and mailing list servers. Even the old and forgotten Usenet might be useful.
- Sift through the gathered information to form a useful result - identify interesting pieces of intelligence for further use.
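The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a real collection tool: the `SOURCES` texts, the `example.com` domain, and the function names `extract_contacts` and `correlate` are all made up for the example, and the e-mail regex is deliberately simple. The idea is to start from a domain, pull contacts out of material collected from several sources, and note which contacts are confirmed by more than one source.

```python
import re
from collections import defaultdict

# Deliberately simple e-mail pattern; real-world addresses can be messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_contacts(text, domain):
    """Return the lowercased e-mail addresses in text that belong to domain."""
    suffix = "@" + domain.lower()
    return {m.lower() for m in EMAIL_RE.findall(text) if m.lower().endswith(suffix)}


def correlate(sources, domain):
    """Map each contact to the set of source names that mention it."""
    seen = defaultdict(set)
    for name, text in sources.items():
        for contact in extract_contacts(text, domain):
            seen[contact].add(name)
    return dict(seen)


# Hypothetical material, standing in for what you collected in step two.
SOURCES = {
    "corporate site": "Contact our admin at j.doe@example.com for support.",
    "mailing list": "From: J.Doe@example.com\nSubject: DNS outage\nCc: someone@other.org",
    "usenet": "posted by bob@example.com, who has since left",
}

found = correlate(SOURCES, "example.com")
# Contacts seen in more than one source are the stronger leads.
confirmed = sorted(c for c, names in found.items() if len(names) > 1)
print(confirmed)
```

A contact that shows up in only one source is not worthless, but it deserves more scepticism than one that several independent sources agree on.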
The process looks very simple on paper, but bear in mind that most searches generate tons and tons of possible clues and/or false leads. It takes
Here is what you'll have to deal with:
- Irrelevant/false hits on a keyword - links or sites that contain the same sequence of words but in a totally different context. The more generic the terms that you are searching for, the more of these there will be.
- Fake contacts placed during the registration process - looking for that all-important 'Who' behind some site or document? Bear in mind that contact information on the web is often fake, entered to avoid being pestered by salespeople. And anyone can use your target's name as an alias on a registration.
- Hundreds or thousands of archived messages from forums and mailing lists - much like the previous item, these are full of aliases and nearly useless communication that need to be sifted through. And you cannot be certain that you are looking at something written by your target of investigation.
- Documents with irrelevant word matching - a large enough digital book will contain all the words of virtually any phrase.
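One cheap way to cut down the noise listed above is to keep only the hits where your keyword appears close to terms you already associate with the target. The sketch below is an assumption of mine, not a standard technique with a name: the `relevance` and `sift` functions, the 80-character window, and the sample documents are all invented for illustration.

```python
import re


def relevance(text, keyword, context_terms, window=80):
    """Count distinct context terms found within `window` characters of any keyword match."""
    lowered = text.lower()
    matched = set()
    for m in re.finditer(re.escape(keyword.lower()), lowered):
        start = max(0, m.start() - window)
        nearby = lowered[start:m.end() + window]
        matched.update(t for t in context_terms if t.lower() in nearby)
    return len(matched)


def sift(documents, keyword, context_terms, min_score=1):
    """Return (score, document) pairs that reach min_score, best first."""
    scored = [(relevance(d, keyword, context_terms), d) for d in documents]
    kept = [pair for pair in scored if pair[0] >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Hypothetical search hits for the name "John Doe".
DOCUMENTS = [
    "The unidentified victim is listed as John Doe in the county court records.",
    "Minutes: John Doe asked about the firewall budget.",
    "John Doe, network administrator at Example Corp, announced the firewall upgrade.",
]

results = sift(DOCUMENTS, "John Doe", ["Example Corp", "administrator", "firewall"])
for score, doc in results:
    print(score, doc)
```

This will not catch fake registrations or aliases, and a low score does not prove a hit is irrelevant. It only helps you decide what to read first.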
There are a lot of tools that will help you on your quest for information, but I'll sum up those that I find useful:
Google hacking - The title says it all. Choose your keywords and then drill for data on Google.
Maltego CE - a client-side program that drills the Internet for information on the element that you have chosen as a starting point. It returns all kinds of possible information for further drill-down, but produces a lot of false positives.
Silobreaker - an information correlation and pattern recognition system that returns results as summarized information clusters related to your search query. It is not always very accurate, so always cross-check with other sources.
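To make the Google hacking entry a bit more concrete, here is a small sketch that assembles a query from common search operators (`site:`, `filetype:`, `intitle:`, quoted phrases, and `-term` exclusions) and turns it into a search URL. The helper names `build_query` and `search_url` and the `example.com` target are my own placeholders. The code only builds the URL, it does not send any request, so you would paste the result into a browser.

```python
from urllib.parse import urlencode


def build_query(phrase=None, site=None, filetype=None, intitle=None, exclude=()):
    """Assemble a search string from a few common Google operators."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')            # exact phrase match
    if site:
        parts.append(f"site:{site}")           # restrict to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")   # restrict to one document type
    if intitle:
        parts.append(f'intitle:"{intitle}"')   # words must appear in the page title
    parts.extend(f"-{term}" for term in exclude)  # drop hits containing these terms
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL for the query (nothing is fetched)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query("network diagram", site="example.com", filetype="pdf", exclude=["jobs"])
url = search_url(query)
print(query)
print(url)
```

Narrowing with `site:` and `filetype:` is also the easiest way to reduce the irrelevant keyword hits mentioned earlier.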
Talkback and comments are most welcome