We are looking for a Data Scientist with the education and experience to work effectively with new start-up companies and existing businesses to maximize their potential to create world-class offerings based on data analytics methods and applied research in machine learning.
The ideal candidate must be comfortable working with a wide range of stakeholders, functional teams, and application domains (agriculture, aquaculture, environment, green energy, health care, business) as well as student (senior undergraduate and graduate) research assistants.
The ideal candidate must be familiar with a data mining project life cycle (such as CRISP-DM) and have experience planning and successfully executing data analytics projects.
The ideal candidate must have experience applying a variety of data mining and data analysis methods and tools, building and implementing models, using and creating algorithms, and designing and running experiments to test such methods.
The right candidate will have a passion for discovering solutions hidden in large data sets and for working with stakeholders to improve business outcomes and create new companies that use data analytics as a core capability.
Responsibilities and Duties:
- Pre-Project Work
- Meet with business representatives, communicate the nature of data analytics engagements by AIDA, and convey the relevant data analytics concepts.
- Determine the nature of the opportunity to which data analytics may be applied, outline the technical problem and project objectives.
- Develop a project plan including the needed human and material resources, costs and timeline.
- Successfully apply for and manage funds for applied data analytics research projects.
- Undertake Data Analytics Projects
- Business Understanding: Work with start-up companies and new ventures of existing businesses to translate a business opportunity or problem into a data analytics problem. Determine the success criteria. Properly assess the technical viability of undertaking a data analytics project.
- Data Understanding: Identify the required data sources and their availability and cost, help companies automate the collection of such data, and generate a metadata report.
- Data Preparation: Determine and apply the appropriate data engineering steps including data cleaning, consolidation and preparation.
- Model Development and Evaluation: Determine and apply the appropriate statistical modeling, machine learning and evaluation, and/or data visualization software and methods to create a solution.
- Model Deployment: Assist the industry partner in the development and testing of operational systems based on project results, if necessary.
- Project Reporting: Create a professional quality project summary report and present the results.
- Apply for Research Funding
- Work with the Director, the RIC and the OICE to apply for funding for applied projects.
- Apply for funding for events, travel, equipment, as may be needed from time to time.
- Supervision of Research Assistants
- Supervise senior undergraduate and graduate research assistants who are working on projects.
- Engage in academic or industry publications where possible for the general good of those working in the data analytics area.
- Promotion, Outreach and Training
- Help deliver the mission of AIDA.
- Work with the Director and AIDA staff to plan, create content for, and deliver talks, tutorials, short courses, and symposia on data analytics methods.
- Assist the Director and AIDA staff on administrative issues such as annual reporting and budget preparation.
Desired Qualifications and Skills
- 2 years of experience manipulating data sets and building statistical/machine learning models
- Master’s or PhD in Computer Science with a major in Data Science or Data Analytics
- Has strong problem-solving skills and applied research interests
- Is able to manage and/or work on projects in diverse domains with various functional teams
- Knowledge of the data mining life cycle (e.g., CRISP-DM) and experience managing projects that include business understanding, data understanding, data preparation and data visualization, predictive model development and evaluation, model deployment, and project reporting.
- Knowledge and experience with the preprocessing of structured and unstructured data and its preparation for model development.
- Experience in the area of data engineering – reducing massive quantities of dirty data to smaller, cleaner data sets for development and production use.
- Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, deep learning, etc.) and their real-world advantages/drawbacks.
- Knowledge of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience with associated tools and libraries (e.g., R, Python, TensorFlow, PyTorch).
- Experience with model performance evaluation using techniques such as cross-validation and statistical hypothesis testing.
- Excellent written and verbal communication skills for advising and managing diverse teams.
- A drive to learn and master new technologies and techniques.
- Familiarity with all or most of the following software/tools:
- Coding knowledge and experience with several languages: C, C++, Java
- Knowledge and experience in statistical and data mining techniques including unsupervised, supervised, semi-supervised learning, and potentially transfer learning.
- Experience querying databases and using statistical computer languages: SQL, R, Python, MATLAB, etc.
- Experience using web services such as AWS, Redshift, S3, Spark, DigitalOcean, etc.
- Experience creating and using advanced machine learning algorithms and statistics: regression, simulation, scenario analysis, modeling, clustering, decision trees, neural networks, deep learning, recurrent neural networks, etc.
- Experience analyzing data from third party providers such as AWS Analytics, Google Analytics, Microsoft Azure, Facebook Analytics, BigML, etc.
- Experience with data engineering tools such as MapReduce, Hadoop, Hive, Spark, Gurobi, MySQL, etc.
- Experience visualizing/presenting data for stakeholders using: Periscope, Business Objects, D3, ggplot, etc.
For more information please contact Daniel L. Silver: firstname.lastname@example.org