PACLIC 2022

Presentation at the 36th Pacific Asia Conference on Language, Information and Computation (PACLIC 36)

The presentation was given on October 20, 2022, at the 36th Pacific Asia Conference on Language, Information and Computation, held in Manila, Philippines. The research aimed to design a Sinhala sentence embedding structure tailored to the Facebook Dataset.

The findings showed that the most effective model combined fastText word embeddings with a sentence embedding structure inside a Seq2seq model with GRU and attention layers. Hyperbolic embeddings performed worse than fastText and Word2Vec embeddings, a shortfall attributed to the lack of a proper parser. GloVe embeddings also performed worse, because no well-suited pre-trained GloVe model exists for the Sinhala language.

References

2022

  1. Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages
    Gihan Weeraprameshwara, Vihanga Jayawickrama, Nisansa Silva, and Yudhanjaya Wijeratne
    In Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, 2022