April 10th, 2013 NY Enterprise Search User Group - Scaling Search for Large Enterprise


On Wednesday, April 10, 2013, OLC attended the NY Enterprise Search User Group’s event, Scaling Search for Large Enterprise, with John Back as the speaker. [Kamran Khan was originally scheduled to speak but was unable to attend due to a scheduling conflict.]


Search Technologies is a consulting organization that specializes in search. Founded in June 2005, it currently has over 100 employees and more than 400 customers worldwide, a number that is still growing, with a presence in the United States, Latin America, the UK and Germany. The company offers deep enterprise search expertise and reports consistent revenue growth and profitability. “All we do is search. We don’t do anything else,” John Back said. “We just do search projects.”

“What we do is independently provide enterprise search product services and expertise,” Back said. “We help develop and support search-based applications using all major search engines. We support internal teams and systems integrations. We have the methodology and technology for content processing.”

Back said that Search Technologies works with large companies such as Microsoft, Amazon and Google. “Everyone uses search in all enterprises,” he said.

“Search can be very complex. The problem is that people misunderstand that search is easy. They only think about the search box. They think that by just typing in a question, people get an answer, but it’s not that. When you’re an organization trying to get good search systems and didn’t design it the right way at the beginning, you’re going to have a tough time with the end result,” Back said. To avoid this, he outlined a three-step methodology: assess the search need, understand the content, and implement the solution.

“We emphasize assessment,” Back said. “It’s the key to answering, ‘How do I figure out what it takes to make search successful?’” Back advised asking questions first and then documenting the analysis that follows. The engagement methodology of Search Technologies has four stages: Assessment (evaluate the technical situation, organize objectives and business drivers, document findings); Statement of Work; Implementation (focus on technical execution and quality, tightly manage objectives as set in the Assessment, ensure completion on time and on budget); and Completion.

He also listed the people who should be in the project meetings when assessing and evaluating the situation: the Project Sponsor or Lead Stakeholder, a business or IS member responsible for the assessment and subsequent implementation; the System Administrator, the IT staff member responsible for hardware infrastructure; the Search Administrator, responsible for supporting search; the Security Administrator, responsible for system and document security; and the Data Owners, responsible for the content that is relevant to the exercise.

Back briefly described the typical topics covered by the consultant. “You should talk about the consultant’s personnel. Complete the tour of existing infrastructure and review the usage of the system. Be sure to review the installation of the system, just in case,” he said.

“Once we get all of the information, we document it and create a report. The documentation should be detailed enough to support and establish understanding between the consultant, project manager and the customer,” Back said. “To fix your problem, we need to understand it. Then we detail that on a step-by-step basis. This creates credibility.”

Back turned to his second point, understanding content. “The second most important thing after assessing problems is understanding the content. A common misconception is that search makes content sources transparent. That’s simply not true,” he said. “In content processing, many apps really deal with data in their own unique ways. Unstructured data is often not in an ideal format: some of it was originally created for human eyes, or it’s missing metadata, or it contains extraneous content, or the content needs to be chunked or joined. Content processing is the process of cleaning, restructuring and reformatting data. It’s making it suitable for consuming apps,” Back said.
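To make the idea of content processing concrete, here is a minimal Python sketch of the three problems Back mentioned: extraneous markup, missing metadata, and content that needs chunking. It is not Search Technologies' code. The function names, the dictionary-based document shape and the default values are all hypothetical, chosen only for illustration.

```python
import re
from html import unescape


def strip_markup(raw: str) -> str:
    """Remove HTML-style tags, decode entities, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)   # drop tags meant for human display
    text = unescape(text)                  # turn &amp; into &, and so on
    return re.sub(r"\s+", " ", text).strip()


def fill_metadata(doc: dict, defaults: dict) -> dict:
    """Return a copy of doc with missing or empty fields filled from defaults."""
    filled = dict(doc)
    for key, value in defaults.items():
        if not filled.get(key):
            filled[key] = value
    return filled


def chunk_words(text: str, size: int) -> list[str]:
    """Split text into chunks of at most `size` words each."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def process(doc: dict, defaults: dict, chunk_size: int = 50) -> dict:
    """Clean, complete and chunk one document (hypothetical document shape)."""
    cleaned = fill_metadata(doc, defaults)
    cleaned["body"] = strip_markup(doc.get("body", ""))
    cleaned["chunks"] = chunk_words(cleaned["body"], chunk_size)
    return cleaned
```

A real project would replace each step with logic specific to its content sources, which is why Back stressed understanding the content before implementing anything.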

He described the DPM [Document Processing Methodology] of Search Technologies. “First, understand the Document Model. Next, understand the User Model. Finally, create a Search Engine Model. Remember to document everything,” Back said. “Basically, assess, write a detailed analysis, then implement the solution.”

Back talked about Search Technologies’ new architecture, Aspire. “We developed a pipeline architecture called Aspire. Essentially, if you don’t have pipeline technology to process data, you’re going to have to do it yourself. It sorts data; it’s basically an assembly line for processing content,” Back said. “Aspire is a framework to support high volume, high performance back-office content processing. It’s a toolkit we use to create components needed for search implementations, including connections. Aspire helps serious developers supplement any search engine to enable and optimize solutions,” he said.
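The "assembly line" idea can be sketched in a few lines of Python. This is a generic illustration of a content processing pipeline, not Aspire's actual API or configuration. Every name here (`run_pipeline`, the stage functions, the dictionary document shape) is an assumption made for the example.

```python
from typing import Callable, Iterable, Iterator, Optional

Document = dict
# A stage takes a document and returns it (possibly modified), or None to drop it.
Stage = Callable[[Document], Optional[Document]]


def run_pipeline(docs: Iterable[Document], stages: list[Stage]) -> Iterator[Document]:
    """Pass each document through every stage in order, like an assembly line."""
    for doc in docs:
        current: Optional[Document] = dict(doc)  # shallow copy; input stays untouched
        for stage in stages:
            current = stage(current)
            if current is None:  # a stage rejected the document
                break
        if current is not None:
            yield current


# Hypothetical stages, for illustration only.
def drop_empty(doc: Document) -> Optional[Document]:
    """Discard documents that have no body text."""
    return doc if doc.get("body", "").strip() else None


def normalize_title(doc: Document) -> Document:
    """Trim the title and put it in title case."""
    doc["title"] = doc.get("title", "").strip().title()
    return doc


def add_word_count(doc: Document) -> Document:
    """Record the number of words in the body as metadata."""
    doc["word_count"] = len(doc["body"].split())
    return doc
```

Calling `list(run_pipeline(docs, [drop_empty, normalize_title, add_word_count]))` runs the three stages in sequence over each document. Keeping each stage small and independent is what lets a pipeline be rearranged or extended without rewriting the whole process.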