Comparative Analysis of Generative AI Tools in the Correction of Essays with a Score of 1000 in ENEM

Jorge Luis Cavalcanti Ramos; Hebert H. Barboza de Brito; João Carlos Sedraz Silva; André G. Martins; Rodrigo Lins Rodrigues

doi:10.18264/eadf.v16i1.2556

Authors

Jorge Luis Cavalcanti Ramos, Universidade Federal do Vale do São Francisco, https://orcid.org/0000-0002-6099-6861
Hebert H. Barboza de Brito, Universidade Federal do Vale do São Francisco, https://orcid.org/0009-0008-9313-2868
João Carlos Sedraz Silva, Universidade Federal do Vale do São Francisco, https://orcid.org/0000-0002-4082-9652
André G. Martins, Universidade Federal do Vale do São Francisco, https://orcid.org/0000-0003-3320-7297
Rodrigo Lins Rodrigues, Universidade Federal Rural de Pernambuco, https://orcid.org/0000-0002-3598-5204

DOI:

https://doi.org/10.18264/eadf.v16i1.2556

Keywords:

Essay, ChatGPT, Gemini, DeepSeek, Maritaca, Automatic correction

Abstract

This study presents a comparative analysis of four generative Artificial Intelligence (AI) platforms in correcting 100 ENEM essays that official INEP evaluators had awarded the maximum score of 1000. The objective was to assess how each platform scores these essays and to compare its scores with the official evaluation. The methodology comprised building a database with the themes and texts of the 100 essays, writing scripts to interact with the platforms, and storing the resulting scores and feedback for each essay, based on the official criteria, for subsequent analysis. The analyses used descriptive and inferential statistics, with ANOVA and Tukey tests. The results indicated that the Brazilian platform Maritaca AI performed best, with scores closest to those assigned by the ENEM evaluators, suggesting its suitability for applications in which the Portuguese language and the Brazilian context must be considered.
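The statistical comparison named in the abstract can be illustrated with a minimal sketch. It is not the authors' script: the platform labels and scores below are made up for illustration, and only the one-way ANOVA F statistic and the mean gap to the official score of 1000 are computed. The Tukey post hoc test is left out because it needs the studentized range distribution, which usually comes from a statistics library.

```python
from statistics import mean

OFFICIAL_SCORE = 1000  # every essay in the study received this official score


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of score lists."""
    all_scores = [s for g in groups for s in g]
    grand_mean = mean(all_scores)
    group_means = [mean(g) for g in groups]
    k = len(groups)      # number of groups (platforms)
    n = len(all_scores)  # total number of scores

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (s - m) ** 2 for g, m in zip(groups, group_means) for s in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


def mean_gap_to_official(scores):
    """Average distance between a platform's scores and the official score."""
    return mean(OFFICIAL_SCORE - s for s in scores)


# Hypothetical scores for illustration only; not the study's data.
scores = {
    "platform_a": [920, 960, 940, 1000],
    "platform_b": [880, 900, 860, 920],
    "platform_c": [960, 1000, 980, 960],
}

f_stat = one_way_anova_f(list(scores.values()))
gaps = {name: mean_gap_to_official(vals) for name, vals in scores.items()}
closest = min(gaps, key=gaps.get)

print(f"F statistic: {f_stat:.2f}")
print(f"Platform closest to the official score: {closest}")
```

A large F statistic suggests that at least one platform's mean score differs from the others. A post hoc test such as Tukey's would then indicate which pairs of platforms differ.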
