Overview:
We are a fast-growing company thriving through the economic downturn by providing a unique and much-needed travel service. We are looking for an outstanding application engineer with strong web-tier development skills and a burning desire to create a great product. The candidate will have a direct impact on UpTake's core data acquisition and processing infrastructure, enabling us to grow our search index to massive scale.
Responsibilities:
Create a world-class web data-mining system capable of extracting meaning and structure from unstructured web documents. The system must efficiently and effectively discover and understand travel-specific knowledge in hundreds of millions of web pages. The successful candidate will own the crawling and data mining algorithms and work with a systems architect to make them fault tolerant and highly scalable across distributed systems.
Experience (required):
- 5 years of hands-on experience and world-class expertise in web crawling and extracting structured information from millions of web pages
- Native speaker of the UNIX command line, XML and regular expressions
- Experience with algorithms for de-duplication, classification, and clustering, and with processes for iterative improvement using training data
- Enjoys working in a collaborative team environment
- Able to work effectively as an individual contributor within a team of widespread talent
- Able to explain their architectural and design decisions to a very talented technical team
- Able to shift gears quickly in a start-up environment
- Has previously worked in a small start-up vertical search environment and wants to do so again
Experience (ideal):
- Hands-on experience using high-performing SQL with very large data sets
- 5+ years of experience and world-class skills in Java development, including intensive, high-performing SQL
- Real-world experience with distributed processing frameworks such as Nutch and Hadoop
- Very familiar with common collaboration and code/build management tools such as SVN, Ant, and Maven
Experience (nice to have):
- Rich experience with web application technologies: Web services, XML, SOAP, SAX, Ruby on Rails, Active Record and/or J2EE technologies including EJB, Spring, Hibernate