Job Description:
Data Scientist - LLM Applications
RESPONSIBILITIES
The objective of this position is to drive innovation in projects focused on the development and
application of data analytics tools to support our engineering (e.g., ASU engineering) and
operations (e.g., ASU, SMR, Electronics Carrier Gas). Focusing on the business needs of Air
Liquide in the context of our customer-centric transformation, the researcher will:
- Collaborate with business entities to define the problem and business requirements,
translate them into functional design specifications, and develop solutions. Identify,
evaluate and select industrial or academic partners as needed.
- Lead and participate in the design, development, and deployment of AI solutions based
on Large Language Models (LLMs) to address key challenges in industrial, R&D, and
healthcare domains.
- Lead the fine-tuning and optimization of open-source LLMs (e.g., Llama, Qwen,
DeepSeek) for specific business scenarios (such as technical document comprehension,
process parameter optimization, safety report analysis, and scientific knowledge mining).
- Expertly apply Retrieval-Augmented Generation (RAG) techniques, integrating internal
knowledge bases (e.g., technical patents, engineering manuals, research reports) with
external data to build high-accuracy intelligent Q&A, content generation, and knowledge
management systems.
- Test and verify the performance of solutions using the prototypes developed.
- Define and develop business tools based upon the prototype performance verification,
ensure transfer of the tool to the operational entities and provide support for the
industrial deployment.
- Train team members on the details of the implemented methodology, thus ensuring
sustainability of the solution for Air Liquide.
- Support knowledge transfer within Air Liquide. Publish research in internal R&D reports,
at conferences and potentially in peer-reviewed journals.
- Work with IT, internal, and external organizations to obtain, clean, visualize, and analyze
data.
- Continuously track the latest advancements in NLP, LLM, and Generative AI (GenAI)
(e.g., Agents, Multi-modality), evaluating and introducing new technologies to enhance
team capabilities.
EXPECTED BACKGROUNDS
- M.S. or Ph.D. in Computer Science, Artificial Intelligence, Statistics, Mathematics,
Engineering, or a related field. Independent and inter-disciplinary research experience is
preferred.
- Solid, practical experience in LLM fine-tuning with a deep understanding of its principles.
- In-depth understanding of RAG architecture with at least one complete, deployed RAG
project. Familiarity with relevant frameworks.
- Excellent fundamental understanding of statistics (e.g., distributions, probability, linear
regression) is a must. Knowledge of advanced statistics (e.g., clustering, elastic net,
MLE, dimension reduction (PCA, PLS, etc.), stochastic processes, Bayesian networks, time
series models) and machine learning models (e.g., decision trees, random forests, SVM)
is of benefit.
- Programming experience with R and Python is preferred. Knowledge of Java, C++, or
JavaScript is also of benefit.
- Excellent communication and interpersonal skills (written and oral). Must be comfortable
working in English on a daily basis in a multi-disciplinary, international team.
Knowledge of French is of benefit.
PREFERRED BACKGROUNDS
- Project experience in industrial manufacturing (e.g., chemical, energy), semiconductors,
healthcare, or supply chain is preferred.
- Familiarity with the selection, deployment, and optimization of vector databases (e.g.,
Milvus, Pinecone, Chroma).
- Familiarity with AI services and tools on at least one cloud platform (AWS, Azure).
- Experience with AIGC, multi-modal models, or AI Agent development.
- MLOps experience (model deployment and serving) and familiarity with tools such as Docker,
Kubernetes, and FastAPI/Gradio.
- Self-motivated individual with the ability to define and solve problems collaboratively
across teams from different backgrounds.
- Publications in top-tier AI/NLP conferences or journals are a plus.
LOCATION
Shanghai, China
ABOUT AIR LIQUIDE
A world leader in gases, technologies and services for Industry and Health, Air Liquide is
present in about 80 countries with approximately 68,000 employees and serves more than 3
million customers and patients. Oxygen, nitrogen and hydrogen are essential small molecules
for life, matter and energy. They embody Air Liquide’s scientific territory and have been at the
core of the company’s activities since its creation in 1902.