
Utilizing Literature to Characterize Materials from Images

The past decade has seen rapid progress in deep learning, which has become a game-changing technique in many data-intensive domains thanks to the availability of large-scale data, cost-effective computing hardware, and more advanced learning theory and algorithms. Despite this progress in everyday applications such as face recognition, video enhancement, and image classification, several challenges still prevent deep learning from being applied in other research fields, such as materials science. Materials science has developed through the empirical correlation of processing and properties for thousands of years. Recently, large volumes of experimental and simulated data have been captured or produced every day, owing to fast image acquisition devices and supercomputing facilities. The success of deep learning in other fields (e.g., computer vision) motivates materials scientists to develop more advanced algorithms that accelerate the discovery and design of new, improved materials with desired properties. However, the application of deep learning in materials science remains at an early stage and requires more effort from researchers. In this thesis, I present my work on characterizing materials from images. One challenge in developing data-driven algorithms in materials science is the lack of well-labeled datasets (e.g., of microscopy images). In fields that deal with natural image classification or detection, large numbers of images are labeled by human annotators (e.g., ImageNet, MS COCO); in materials science, however, this would be expensive or even infeasible, because annotation requires substantial domain expertise. To this end, we present our work on constructing a materials dataset from the scientific literature, for which we developed an effective tool that builds a self-labeled electron microscopy dataset of nanostructure images.
In the second part of the thesis, I present our work on the interpretation of spectrographs. To convey the insights behind these measurements, data points are usually displayed in graphical form in scientific journal articles. However, it is not standard practice for materials researchers to release raw data along with their publications. As a result, other researchers must use interactive plot data extraction tools to recover data points from the graph images, which makes large-scale data acquisition and analysis difficult. We therefore propose the Plot2Spectra pipeline, which extracts spectra data from plot images efficiently and fully automatically.
In the last part of the thesis, I present our work on deducing structural information from STEM (Scanning Transmission Electron Microscopy) measurements. Microscopic imaging provides real-space information about matter over a large range of scales and plays an important role in understanding the correlations between structure (e.g., morphology, phase, atomic structure, surface facet, interfacial structure) and properties in materials science. Extracting structural information (e.g., atomic positions) is therefore important for exploring crystallographic phases, atomic configurations, and the insights behind structure-related, material-specific properties and performance. However, deducing structural information from STEM measurements is challenging. To this end, we present STEM2SIM, a representation learning framework for HAADF-STEM image retrieval, which deduces structural information (e.g., crystalline structure) from a given STEM image by efficiently finding similar images with known structure in a simulated dataset.
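The retrieval idea behind STEM2SIM can be illustrated with a minimal sketch. It assumes that each image has already been mapped to a fixed-length embedding vector by some learned model, and that similarity is measured by cosine similarity; both are assumptions made here for illustration, since the abstract does not specify the embedding model or the similarity measure. The function names, structure labels, and vectors below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_most_similar(query_embedding, simulated_embeddings, top_k=1):
    """Rank simulated images (keyed by their known structure label) against a query.

    simulated_embeddings: dict mapping a structure label to its embedding vector.
    Returns a list of (label, similarity) pairs, most similar first.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), label)
        for label, embedding in simulated_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(label, score) for score, label in scored[:top_k]]


# Hypothetical embeddings of simulated images with known structures.
simulated = {
    "structure_A": [0.9, 0.1, 0.0],
    "structure_B": [0.0, 1.0, 0.2],
    "structure_C": [0.1, 0.0, 0.95],
}

# Hypothetical embedding of an experimental STEM image.
query = [0.8, 0.2, 0.1]

best = retrieve_most_similar(query, simulated, top_k=1)
print(best[0][0])
```

In this toy setting the query is assigned the structure label of its nearest simulated neighbor. A brute-force scan is enough for a handful of vectors; a large simulated dataset would likely need an approximate nearest-neighbor index.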
