Data Library Production Engineer

San Francisco, CA

Palamida produces the most comprehensive Open Source Software Detection and Reporting Library in the world. Through extensive manual and automated collection efforts, we've amassed over three terabytes of Open Source software from major code repositories and important secondary sources. Palamida's patented data compilation system operates in a clustered server environment to produce very compact and efficient data libraries that are the core of our detection and reporting solutions. The ongoing implementation and operation of these systems and their associated large datasets provide a technically challenging and exciting area of work. We're looking for a Data Library Production Engineer to lead ongoing data production and the implementation of new data processing capabilities.

The Data Library Production Engineer has overall editorial and production responsibility for Palamida's family of data library products. The successful candidate will work with software development and quality engineers, OSS collection associates, and IT staff. This is a newly established position in the Engineering organization, reporting to the Director of Technical Operations.

Responsibilities
  • Planning and tracking of data library production runs from data collection to library testing and release
  • Maintaining a prioritized list of repositories and projects for collection based on customer and Professional Services requests
  • Editorial review of collected and edited data
  • Management of automated and manual data collection activities
  • Planning, deployment, and ongoing management of the data production server clusters (hardware and software)

Qualifications
  • Familiarity with Open Source licenses, projects, and repositories (e.g. SourceForge, Apache, Eclipse, Open Source Initiative)
  • Usage and administration of Linux, Unix, or Solaris systems
  • Basic database administration skills
  • Basic Perl scripting skills
  • QA and System Administration experience is highly desirable

Please include "Data Library Production Engineer position" in the subject line when submitting your resume.