Senior Web Mining Engineer

Overview:
We are a fast-growing company thriving through the economic downturn by providing a unique and much-needed travel service. We are looking for an outstanding application engineer with strong web-tier development skills and a burning desire to create a great product. The candidate will have a direct impact on UpTake's core data acquisition and processing infrastructure, enabling us to grow our search index to massive scale.

Responsibilities:
Create a world-class web data-mining system capable of extracting meaning and structure from unstructured web documents. The system must efficiently and effectively discover and understand travel-specific knowledge in hundreds of millions of web pages. The successful candidate will own the crawling and data mining algorithms and work with a systems architect to make them fault-tolerant and highly scalable across distributed systems.

Experience (required):

  • 5 years of hands-on experience and world-class expertise in web crawling and extracting structured information from millions of web pages
  • Native speaker of the UNIX command line, XML and regular expressions
  • Experience with algorithms for de-duplication, classification, clustering and with processes for iterative improvement using training data
  • Enjoys working in a collaborative team environment
  • Able to work effectively as an individual contributor with a team of widespread talent
  • Able to explain their architectural and design decisions to a very talented technical team
  • Able to shift gears quickly in a start-up environment
  • Has previously worked in a small start-up vertical search environment and has a desire to do so again

Experience (ideal):

  • Hands-on experience using high-performance SQL with very large data sets
  • 5+ years of experience and world-class skills in Java development, including intensive, high-performance SQL
  • Real-world experience with distributed processing frameworks such as Nutch and Hadoop
  • Very familiar with common collaboration and code/build management tools such as SVN, Ant and Maven

Experience (nice to have):

  • Rich experience with web application technologies: Web services, XML, SOAP, SAX, Ruby on Rails, Active Record and/or J2EE technologies including EJB, Spring, Hibernate


© 2006 - 2012 UpTake Networks, Inc.