Star-galaxy separation using machine learning algorithms applied to preliminary data from the miniJPAS survey.

Name: Pedro Otavio Souza Baqui
Type: PhD thesis
Publication date: 08/05/2020
Advisor:

Name Role
Luciano Casarini Co-advisor *
Valerio Marra Advisor *

Examining board:

Name Role
Davi Cabral Rodrigues Internal Examiner *
Júlio César Fabris Internal Examiner *
Luciano Casarini Co-advisor *
Luis Raul Weber Abramo External Examiner *
Miguel Boavista Quartin External Examiner *
Oliver Fabio Piattella Internal Examiner *
Valerio Marra Advisor *

Summary: Future astrophysical surveys such as J-PAS will produce datasets of unprecedented size, reaching a rate of 150 TB per day. New tools for processing this volume of data must therefore be employed, preferably ones that respond efficiently, accurately, and in almost real time. This is an ideal scenario for the application of machine learning (ML) methods. In this work we analyzed data from the pathfinder miniJPAS survey, which observed ~1 deg² of the AEGIS field with 56 narrow-band filters and the 4 broad-band filters ugri. Here we discuss the classification of miniJPAS sources into point-like and extended objects, a necessary step for subsequent scientific studies. Our goal is to develop an ML classifier that is complementary to traditional tools based on other models and, in particular, to build a value-added catalog with our best classifications. To train and test our classifiers, we cross-matched the miniJPAS dataset with SDSS and HSC-SSP data, whose classifications we assume to be reliable within the ranges 15 < r < 21 and 18.5 < r < 23.5, respectively. We trained and tested six different ML algorithms on the two cross-matched catalogs: K-nearest neighbors (KNN), decision trees (DT), random forest (RF), artificial neural networks (ANN), extremely randomized trees (ERT), and an ensemble classifier (EC). As input to the ML algorithms, we used the magnitudes in the 60 filters, with and without morphological parameters. We concluded that, with respect to the SDSS classification, the EC algorithm performs best, obtaining AUC = 0.9992 (area under the ROC curve) and MSE = 0.009 (mean squared error). At fainter magnitudes, with respect to the HSC-SSP classification, the EC again achieves the best performance, obtaining AUC = 0.9744 and MSE = 0.0370. The latter results were obtained using the photometric bands together with the morphological parameters.
ML algorithms can compete with traditional star-galaxy classifiers, potentially outperforming them at fainter magnitudes (r ≥ 21). Finally, we built a catalog for the range 15 ≤ r ≤ 23.5 using classifiers trained on the merged labels from the SDSS and HSC-SSP surveys.
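
The two figures of merit quoted in the summary, AUC and MSE, can be illustrated with a minimal sketch. It is not the thesis code: the label convention (1 = galaxy/extended, 0 = star/point-like), the function names, and the toy numbers are all assumptions made for illustration, and MSE is assumed to compare the predicted galaxy probability with the true label.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 for galaxy (extended), 0 for star (point-like); assumed convention.
    scores: classifier output, read as the probability of being a galaxy.
    """
    galaxies = [s for y, s in zip(labels, scores) if y == 1]
    stars = [s for y, s in zip(labels, scores) if y == 0]
    if not galaxies or not stars:
        raise ValueError("need at least one object of each class")
    # Fraction of (galaxy, star) pairs ranked correctly; ties count one half.
    wins = sum(
        1.0 if g > s else 0.5 if g == s else 0.0
        for g in galaxies
        for s in stars
    )
    return wins / (len(galaxies) * len(stars))


def mean_squared_error(labels, scores):
    """Mean squared difference between predicted probability and true label."""
    return sum((s - y) ** 2 for y, s in zip(labels, scores)) / len(labels)


# Hypothetical toy sample: three galaxies and three stars.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

auc = roc_auc(labels, scores)             # 8 of 9 pairs ranked correctly
mse = mean_squared_error(labels, scores)
print(f"AUC = {auc:.4f}, MSE = {mse:.4f}")
```

A perfect classifier would give AUC = 1 and MSE = 0, which is the sense in which the values quoted in the summary indicate good performance.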

© 2013 Universidade Federal do Espírito Santo. All rights reserved.
Av. Fernando Ferrari, 514 - Goiabeiras, Vitória - ES | CEP 29075-910