Senior Data Scientist

Responsibilities include:

  • Conduct in-depth empirical research on trading strategies and test results of such strategies
  • Establish scalable, efficient, automated processes for large-scale time-series data analysis
  • Manipulate and analyze complex, high-volume time-series data sets
  • Develop distributed/parallel solutions for data analysis, mining and visualization of structured and unstructured time-series data sets
  • Develop, optimize, parallelize and implement novel algorithms for statistical modeling and data mining of complex time-series data on high-performance multicore and distributed architectures, including Intel Xeon Phi, distributed clusters and CUDA (desired but not essential)
  • Serve as the Machine Learning expert on cross-functional teams, working throughout product development life cycles and in support of production trading operations
  • Monitor and continuously evaluate new methodologies and third-party technologies addressing analysis of large data sets applicable to time-series data
  • Perform data analysis in support of decision making in diverse areas of the business such as technology assessment, method development, algorithm development, process/system testing and validation, as well as process optimization/automation
  • Select and apply appropriate machine-learning methodologies to deliver critical/actionable results and provide timely guidance to management and team leaders
  • Deliver crucial components used to validate algorithmic trading strategies
  • Manage other data scientists and collaborate with team and project leaders to ensure efficient utilization of company assets designated for data analytics

Requirements:

  • Advanced degree in Computer Science with an emphasis on data mining, machine learning, statistics or a related discipline, plus 3+ years of work experience
  • Experience handling gigabyte- and terabyte-scale data sets and working with distributed systems
  • Expertise in developing algorithms and applications using MapReduce, MPI, OpenMP or similar frameworks
  • Fluent in theory and application of standard machine learning or data mining algorithms
  • Expertise in applying descriptive and inferential statistics to real-world Big Data problems
  • Able to utilize in-house file systems, databases, and data flow control systems built in C++, with new languages and technologies continuously being evaluated
  • Familiarity with big data technologies with ability to identify the best technology for a given problem
  • Comfortable with C/C++ and scripting (awk and Python preferred)
  • Able to integrate and apply feedback in a professional manner
  • Able to lead and work as part of a team