We are looking for a highly skilled PyTorch Machine Learning Engineer to join our team and help extract valuable data from more than 300,000 research papers on carbon emissions. The ideal candidate will have a strong background in natural language processing, computer vision, and PyTorch. The goal is to build a model that extracts textual, tabular, and graphical data from unstructured documents, with a target of 100% accuracy.
Responsibilities:
1. Develop and implement machine learning models for data extraction from unstructured documents using PyTorch
2. Preprocess and clean large datasets of research papers on carbon emissions
3. Combine natural language processing techniques with computer vision methods to accurately extract textual, tabular, and graphical data
4. Optimize models for performance, scalability, and accuracy
5. Collaborate with team members to understand project requirements and deliver solutions
6. Stay current with the latest research and advancements in machine learning and data extraction
7. Document and present the progress and results of the project to stakeholders
Requirements:
1. Master’s degree or higher in Computer Science, Data Science, or a related field
2. Strong expertise in PyTorch and other deep learning frameworks
3. Proven experience in natural language processing and computer vision techniques
4. Solid understanding of machine learning algorithms and their applications
5. Proficiency in Python programming
6. Experience with data preprocessing, feature engineering, and model optimization
7. Strong problem-solving skills and the ability to work both independently and as part of a team
8. Excellent communication and presentation skills in English
Preferred Qualifications:
1. Previous experience working with unstructured documents, such as research papers
2. Familiarity with carbon emissions data and related research
3. Proficiency in GPU-based computing and parallel processing