ISSN 2278-3091

Volume 9, No.2, March - April 2020
Adomar L. Ilao et al., International Journal of Advanced Trends in Computer Science and Engineering, 9(2), March - April 2020, 1744 – 1751
International Journal of Advanced Trends in Computer Science and Engineering
Available Online at http://www.warse.org/IJATCSE/static/pdf/file/ijatcse128922020.pdf
https://doi.org/10.30534/ijatcse/2020/128922020

SENTIPUBLIKO: Sentiment Analysis of Repost Jejemon Messages using Hybrid Approach Algorithm

Adomar L. Ilao1, Arnel C. Fajardo2

1 Technological Institute of the Philippines, Quezon City-Campus, Philippines, alilao@mcl.edu.ph
2 Technological Institute of the Philippines, Quezon City-Campus, Philippines, acfajardo@gmail.com


ABSTRACT

Jejemon language has become a form of communication dialect. It is a form of expression used by a particular social group known as Jejemon. However, Jejemon expression comes in different formats, ranging from basic forms, such as changing letters to numbers, changing lowercase letters to uppercase, and inserting shortcut texts, to more complicated formats. This paper aims to classify each Jejemon tweet as positive, negative, or neutral through sentiment analysis techniques. The experiment included translation of Jejemon-formatted tweets, reduction of sentiment scores on repost tweets, and sentiment classification. Analysis of the experiment results involved a paired T-Test, confusion matrix, precision, recall, f-score, and accuracy. The translated Jejemon tweets were 78.5% similar to the actual messages according to the cosine similarity algorithm. Furthermore, the paired T-Test shows no significant difference between the new sentiment scores from the translated expressions and the actual sentiment scores using the Hybrid Algorithm. Sentiment analysis metrics, namely precision, recall, f-score, and accuracy, show acceptable values of 71%, 76%, 71%, and 73% respectively.

Key words: Sentiment Analysis, Social Media, Jejemon, Hybrid Algorithm, Cosine Similarity, Dictionary Substitution Approach, Tweet

1. INTRODUCTION

Human language is constantly changing, both across time and across social groups. These changes yield negative perceptions from people who are unable to cope with new vocabulary or new visual representations. The downside of new trends is possible miscommunication between social groups.

In the Philippines, several languages were invented to cater to specific social groups. Predominantly, millennials and members of the third sex are the most socially active groups in terms of modern language expression [1]. They developed new language semantics such as Jejemon (p30pL3, o+h3r, pl4c3s) [2] and Bekimon (Aglipay, Chiquito, Churchill). However, the most controversial language trend in 2016 is Jejemon [3].

Nevertheless, Jejemon language users suffered from weak speaking ability and mangled word spelling, as observed by English teachers. They average 12 text messages every day. Each Jejemon message is normally composed of symbols and phonetics. However, the advent of technological communication media focused on social connection, such as social media, created new Jejemon followers on Twitter, aside from JEJETYPING, wearing a jeje-hat, and jeje-photos online [4].

The overwhelming success of Jejemon language led to its award as Word of the Year in 2010 by the Filipinas Institute of Translation Incorporated, based on its significant impact on Filipino life in socio-cultural, political, social, and economic terms [5].

As an influential millennial language, Jejemon expression is a result of self-expression designed to resolve concerns about the limited space provided by text messages and Twitter [5]. However, beyond translation, it is important to understand the actual feeling behind a person's message or opinion.

Classification of one's opinion can be done through sentiment analysis, which classifies an opinion as positive, negative, or neutral via a polarity score. Another factor to consider is the impact of reposts, known as retweeting in Twitter, on sentiment classification at the document level.

This study is designed to evaluate the accuracy and precision of the Hybrid Algorithm developed by Ilao and Fajardo [6] when applied to Jejemon tweets. Furthermore, integrating a string similarity algorithm to reduce the sentiment polarity score of repost or retweet messages will provide a better understanding of the side effect of repost messages in document-level sentiment evaluation.

2. REVIEW OF RELATED LITERATURES

2.1 Jejemon Word Structure

Jejemon language became popular in 2010 because of the limited space available for text messages and tweets [5]. It is primarily composed of an alphabet known as Jejebet.


Jejebet uses the Roman alphabet, Arabic numerals, and other special characters, namely 4, b, c, D, 3, f, 6, h, 1, j, k, 7, m, N, 0, p, Q, r, 5, t, u, V, w, x, Y, or z. A Jejemon word is arranged with alternating capitalization, over-usage of the letters H, X, or Z, and a mixture of numeric characters and the English alphabet [5], as shown in Table 1.

Table 1: Example of Jejemon Expression [5]

Characteristics of Jejemon | Example
Insertion of unnecessary numbers and letters | phfue or p0w
Unique orthography based on how the words sound | eHyUoeW fPuoEh
Unconventional use of punctuations | psenxa na ha!!
Numbers to substitute letters | bzt4h
Alternate use of lower and upper case | WE wnT 2 BE~ P0wh.
Use of onomatopoeic lexis/emotional language | tnx pfowh jejeje
Lengthening of vowels and consonants | TAMAAA!
Substitution of spelling | Maq

2.2 Jejemon Translation

Jejemon translation is available online. One particular online translator is the Jejemon Translator found at http://173.254.110.65/jejeschool/index.php. It is capable of translating English into Jejemon expression, as shown in Figure 1.

Figure 1: English Language to Jejemon Language Translation.

It was applied to transform the English version of a thesis document [7] into the jejenese or Jejemon version available in iskwiki.upd.edu.ph. This shows the reliability of the Jejemon translator as an effective technical tool. The Jejemon document revision demonstrated a combination of three Jejemon techniques, namely numbers to substitute letters, symbols to substitute letters, and alternate use of lowercase and uppercase letters.

An alternative online Jejemon translator can be found at http://akosijairah.blogspot.com/2010/04/Jejemon-translator-v3_28.html, as shown in Figure 2. The translated expression is composed of case conversion, p0wh insertion, and modification of words to a totally different spelling.

Figure 2: English Language to Jejemon Language Translation.

2.3 Dictionary Substitution Approach

The Dictionary Substitution Approach is a technique commonly identified as a search-and-replace approach. Its key characteristic is matching a word against the corpus. However, when multiple entries are found in the corpus, a word is selected at random from the list of possible alternatives [8].

Jejemon language does not follow a specific pattern. The dictionary substitution approach can replace a non-standard Jejemon token with a meaningful English or Filipino word.

The English sentence in Table 2 was translated to its Jejemon equivalent using three sources. The table shows three different techniques of Jejemon translation.

Table 2: Sample Online English Expression translated into Jejemon Expression

English Sentence: I would like to know more about you, care to tell me your name? Hehehehe!

URL Source of Jejemon Translator | Jejemon Expression
https://pinoychronicle.wordpress.com/Jejemon/ | i wuD LLyK tO knOw moR3 bOut u. crE 2 t3ll mE yur N@me? jejejejeje!
Online Jejemon Translator (http://173.254.110.65/jejeschool/index.php) | 1 wUD 77yk +0 kn0w mUhr3 4b0U+ U', cr +0 +377 m3 U'r nm3?'' j3j3j3h3!
Online Jejemon Translator (http://akosijairah.blogspot.com/2010/04/Jejemon-translator-v3_28.html) | i WOUlD Lyk To KNow more aboutz u, CeyRTo Tell me uR nMe, n0H? JEjEJEjE LOLz!


2.4 Lexicon-Based Algorithm

A lexicon-based algorithm works by defining rules to classify an opinion. It tokenizes every sentence in each document and tests whether each token or word is present in the database [8]. Each rule is composed of an antecedent and a consequent. An antecedent defines a condition and consists of either a token or a sequence of tokens. This process provides a technique to single out whether a subjective opinion is positive, negative, or neutral [9][10][11].

It uses a sentiment lexicon to assign a polarity value. A lexicon is composed of words or phrases, each labeled with a polarity value indicating positive or negative orientation [12]. There are three strategies for building a sentiment lexicon, namely hand-crafted elaboration, automatic expansion from an initial list of seed words, and the corpus-based approach.

A comparative study [13] on lexicon-based review classification involved AFINN, General Inquirer, Micro-WNOP, Opinion Lexicon, SentiSense, SentiWordNet, Subjectivity Lexicon, and WordNet-Affect. The investigation resulted in 78% accuracy for SentiWordNet, which utilizes the WordNet corpus.

2.5 String Similarity Algorithm

String-based similarity measurement defines the similarity of strings in terms of the longest prefix common to both strings. It is applied in several fields, namely data cleaning, data integration, error checking, and pattern recognition [14]. It uses either a character-based or a term-based technique [15]. A commonly used string-based similarity measure is the edit distance algorithm, which counts the minimum number of insertions, deletions, or substitutions needed to transform string 1 into string 2 [14].

A comparative study of string similarity algorithms, namely Q-gram similarity, cosine similarity, and Dice coefficient similarity, favored the cosine similarity algorithm, with an average accuracy of 63% [16]. Cosine similarity measures the cosine of the angle between two finite-dimensional vectors of the same dimension [15]. Furthermore, cosine similarity was applied to analyze the similarity of sentiment scores from SentiWordNet and how the similarity of each sentiment score contributed to product review rating prediction [17].

2.6 Hybrid Polarity Score Algorithm

Each synset polarity score is derived by computing the average of the SentiWordNet and VADER scores, as elaborated in Equation 1, Equation 2, and Equation 3 [18]. Normally, VADER uses its compound score to determine the sentiment classification. However, Ilao's Hybrid Algorithm uses VADER's positive and negative scores to derive the sentiment score.

$$\text{Hybrid positive score} = \frac{\text{SentiWordNet Positive Score} + \text{VADER Positive Score}}{2} \tag{1}$$

$$\text{Hybrid negative score} = \frac{\text{SentiWordNet Negative Score} + \text{VADER Negative Score}}{2} \tag{2}$$

$$\text{Over-all Hybrid Score (OHS)} = \text{Hybrid positive score} - \text{Hybrid negative score} \tag{3}$$

Where:
If OHS > 0, then sentiment is “Positive”.
If OHS < 0, then sentiment is “Negative”.
If OHS = 0, then sentiment is “Neutral”.

The number of occurrences of positive, negative, and neutral tweets determines the over-all sentiment of the entire population of collected political tweets. The hybrid approach was tested on different political datasets. The experiment yielded 88.33% accuracy, better than the SentiWordNet and VADER algorithms.

3. METHODOLOGY

The study was designed to implement the sentiment analysis approach illustrated in Figure 3.

Figure 3: Jejemon Repost Hybrid Polarity Score Algorithm Conceptual Framework

It involves several stages such as data cleaning, removal of irrelevant words, dictionary substitution and loan translation approach, tokenization, stemming, feature extraction, hybrid sentiment polarity score approach (SentiWordNet Score,
VADER score, Filipino Score, Recurrence Percentage Reduction Score), and sentiment classification.

3.1 Pre-Processing

3.1.1 Data Cleaning

It involves elimination of unwanted expressions, such as removal of punctuation, website URLs, emoticons, special characters, and apostrophes.

3.1.2 Removal of Irrelevant Words

It is intended to remove irrelevant content such as slang words [19], stop words, Jejemon expressions (example: poWH, jejeje, xD), and non-Jejemon formatted text.

3.1.3 Dictionary Substitution and Loan Translation Approach

A Jejemon lexeme might contain an abbreviation or shortcut text of an English or Filipino expression. A CSV file and a text file store shortcut English text, shortcut Filipino text, and the most sentimental Jejemon words. When a translated lexeme is found in the seed list, the detected text is replaced with its actual value. This stage also involves translation of the Jejemon expression into a purely English expression, a purely Filipino expression, or a combination of both languages.

Translation utilized the enchant libraries [20] and a collection of Filipino words from a Tagalog dictionary [21] to cross-check spelling or obtain possible word suggestions when transforming a complicated Jejemon expression.

Furthermore, a customized Tagalog dictionary, constructed from a collection of rubbish words derived from English or Filipino words, will resolve some of the ambiguity coming from Jejemon expression translation.

3.1.4 Tokenization

Tokenization segments a Jejemon expression, which might be in the form of a paragraph, sentences, or words, into lexemes (single words).

3.1.5 Stemming

In this stage, each Jejemon lexeme will be subjected to extraction of its base word by simplifying the plural form to singular or by removing the prefix, suffix, and infix.

3.2 Feature Extraction

3.2.1 N-gram Approach

The study will implement a hybrid approach to classify a given Jejemon expression as positive, negative, or neutral. The two lexicon-based algorithms proven effective by Ilao's study in 2019, SentiWordNet and VADER, have their polarity scores combined as shown in Equation 1, Equation 2, and Equation 3; the two algorithms implement the uni-gram and tri-gram approaches respectively.

3.2.2 Part-of-Speech (POS) Tagging

Each English lexeme will be tagged via the Stanford POS Tagger. Each tag will determine whether the lexeme can be a source of sentiment for SentiWordNet.

However, Filipino lexemes will not undergo the POS tagging procedure. The study will use a Tagalog corpus to identify positive and negative political words.

3.3 Hybrid Approach

3.3.1 SentiWordNet Polarity Score

The SentiWordNet polarity score features positive, negative, and neutral scores. Nonetheless, the hybrid approach requires only the positive and negative elements of its derived polarity score.

3.3.2 VADER Polarity Score

Classification of sentiment using VADER normally uses the compound score. However, the hybrid approach will apply the positive score and negative score as shown in Equation 1 and Equation 2.

3.3.3 Filipino Sentiment Polarity Score

The Filipino sentiment polarity score will be applied to Filipino expressions based on Equation 4.

Every positive or negative word detected from the collected list of commonly used Tagalog political sentiment words will be scored as 1 point.

$$\text{Filipino Sentiment Polarity Score (FOSS)} = \frac{\text{Number of Positive Words} - \text{Number of Negative Words}}{\text{Number of Words in the Tweet}} \tag{4}$$


Where:
If FOSS > 0, then sentiment is “Positive”.
If FOSS < 0, then sentiment is “Negative”.
If FOSS = 0, then sentiment is “Neutral”.

3.3.4 Cosine Similarity

The cosine similarity algorithm will be used to determine whether there are similarities between posts. If cosine similarity returns a string similarity value of at least 70%, a significantly identical tweet exists in the list of collected tweets, and the tweet is considered a repost message.

3.3.5 Recurrence Percentage Reduction Score

After the string similarity algorithm identifies similarity between tweets, Equation 5 and Equation 6 will define the reduction score from the previously derived value and generate a new polarity score for the repost message.

$$\text{Relative Frequency Rate} = \frac{\text{number of item repost}}{\text{total number of post messages}} \times 100 \tag{5}$$

$$\text{Recurrence Percentage Reduction Score} = \text{Over-all Hybrid Score} - (\text{Relative Frequency Rate} \times \text{Over-all Hybrid Score}) \tag{6}$$

3.4 Sentiment Classification

The study was designed to perform sentiment analysis at the document level. Each Jejemon expression will be classified as positive, negative, or neutral. After classification, each identified sentiment will be counted individually; the class with the highest frequency among positive, negative, and neutral determines the over-all sentiment of the collected instances.

4. DATA SOURCE

The study prepared a dataset of 131 instances collected from the Twitter accounts Jejemonilao and Jejemonkami from January 15, 2020 to February 29, 2020, as described in Table 3, Table 4, Table 5, and Table 6.

The dataset consists of instances with a maximum of 37 tokens, a minimum of 3 tokens, and an average of 12 tokens per instance.

Jejemonilao tweets followed the Jejemon format available at http://akosijairah.blogspot.com/2010/04/Jejemon-translator-v3_28.html and are considered the controlled environment. The uncontrolled environment instances were extracted from Jejemonkami tweets, which were formatted based on the preference of participants using any available Jejemon pattern. Moreover, the collected dataset is composed of purely English Jejemon tweets, purely Filipino Jejemon tweets, and a combination of both.

Table 3: Sample of Jejemon Format Domain

Jejemon Format Domain | Expression
Online Resource (Controlled Environment) | GOveRnmEnt'Z Slow respOnSe TO d cOvID-19 hz Led tO D ViRuz entErinG d pHilIpPinEs p0wh.
Personal Preference (Uncontrolled Environment) | i h8 tht d govERNmeNt prioRiTizE themsELVez ratheR THn D PeopLe

Table 4: Tweet Post Frequencies

Tweet Post | Frequencies (Tweets)
Single Post | 116
Repost | 15

Table 5: Jejemon Collected Tweet used Language

Jejemon Tweet used Languages | Frequencies
English | 113
Filipino | 14
Combination | 4

Table 6: Sentiment Classified Frequencies

Sentiment Classification | Frequencies
Positive | 29
Negative | 76
Neutral | 26

5. MEASUREMENT OF ALGORITHM PERFORMANCE

Algorithm performance will be validated in terms of the paired T-Test, accuracy, recall, f-score, and precision using Equation 7, Equation 8, Equation 9, and Equation 10 via the confusion matrix values shown in Table 7.

Table 7: Confusion Matrix

Actual \ Predicted | Negative (F) | Positive (T)
Negative (F) | FF | FT
Positive (T) | TF | TT

Accuracy = (7)

Precision = (8)


Recall = (9)

F-Score = (10)

Where:
FF: the frequency of correctly predicted negative emotion.
FT: the frequency of incorrectly predicted negative emotion.
TF: the frequency of incorrectly predicted positive emotion.
TT: the frequency of correctly predicted positive emotion.

6. EXPERIMENT

The study evaluated the success rate of translating Jejemon instances to their counterpart language via the string similarity algorithm, as presented in Table 8.

The similarity percentage is based on the cosine similarity value of the translated expression against the actual expression. Similarity values of 70% and above indicate “Strong Similarity”, while values below 70% indicate “Weak Similarity”.

Table 8: Similarity Percentage of Translated Expression and Actual Expression

Jejemon Format Domain | No. of Strong Similarity (70% and Above) | No. of Weak Similarity (Below 70%) | Average Cosine Similarity Percentage
Online Resource (Controlled Environment) | 92% | 7% | 81%
Personal Preference (Uncontrolled Environment) | 73% | 27% | 76%

The controlled group's translation similarity percentage ranges from 23% up to 100%. The factors affecting translation, reflected in Table 9, were word case; letters omitted by the Jejemon expression, which lead to a different English or Filipino expression when translated because of ambiguity; proper spacing within the Jejemon expression; and singular/plural forms. The uncontrolled group's similarity percentage ranges from 25% up to 100%. Unlike the controlled group, the uncontrolled group's factors are special characters such as Ë, ë, $, @, or ü; repetition of letters, namely “z” and “s”; two representations of 8 (either “te” or “ate”), of “i” (either “!” or “1”), and of “a” (either “4” or “@”); and the Filipino word “po” used interchangeably with “poh”.

Table 9: Cosine Similarity Percentages per Language

Languages | Lowest Similarity Percentages | Highest Similarity Percentages
English | 23% | 100%
Filipino | 25% | 100%
Combination | 40% | 70%

Table 9 shows that the lowest similarity percentage, 23%, belongs to an English expression. Several words of an English instance were translated wrongly because of ambiguity in the Jejemon expression, namely missing letters replaced by different letters and capitalization of words. A similar scenario happened with the combined English and Filipino expressions, which had the lowest value among the highest similarity percentages, 70%, across the languages. The experiment encountered difficulties especially with missing letters. The list of suggested words that might suitably replace a rubbish word caused by missing letters comes from the English or Tagalog dictionary. In some instances, an English word was taken as a Filipino word or vice versa.

After Jejemon translation, the new dataset composed of English, Filipino, or combined expressions underwent pre-selection. As stated in previous studies on machine translation of several Filipino dialects, accuracy rates are 70.67% [22] and 69.5% [23]. Pre-selection of instances is based on a similarity percentage between 70% and 100%. Qualified instances are shown in Table 10.

Table 10: New Instances after Pre-Selection

Languages | Controlled Environment | Uncontrolled Environment
English | 33 | 64
Filipino | 0 | 2
Combination | 1 | 1
Total | 34 | 67

Table 10 shows a 12% decrease in instances under the controlled environment, from 39 instances down to 34 instances. Similarly, there is a 27% decrease in uncontrolled environment instances, from 92 instances down to 67 instances.

Table 11: Paired T-Test of Computed Sentiment Scores
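The paired T-Test comparison summarized in Table 11 can be sketched in Python as follows. The score lists are hypothetical placeholders rather than the study's data, and the function name `paired_t_statistic` is introduced only for illustration. The sketch computes the t statistic from the per-instance differences and compares it against the two-tailed critical value for 4 degrees of freedom at the 0.05 level (2.776) instead of computing a p-value.

```python
import math
import statistics


def paired_t_statistic(new_scores, original_scores):
    """Return (t, degrees_of_freedom) for paired samples of equal length."""
    if len(new_scores) != len(original_scores):
        raise ValueError("paired samples must have the same length")
    diffs = [n - o for n, o in zip(new_scores, original_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical sentiment scores (placeholders, not values from the study).
new_scores = [0.10, -0.20, 0.05, 0.00, -0.15]       # from translated expressions
original_scores = [0.12, -0.25, 0.05, 0.02, -0.10]  # from actual expressions

t, df = paired_t_statistic(new_scores, original_scores)
CRITICAL_T_DF4 = 2.776  # two-tailed critical value, alpha = 0.05, df = 4
significant = abs(t) > CRITICAL_T_DF4
print(round(t, 2), df, significant)
```

With these placeholder values, |t| stays below the critical value, which corresponds to a p-value above 0.05. An exact p-value requires a t-distribution function, for example `scipy.stats.ttest_rel` where SciPy is available.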


Table 11 describes the relationship between the different sentiment scores: the new score is derived from the translated expression, and the original score is derived from the actual expression, in English, Filipino, or a combination, provided by participants. Furthermore, instances that cosine similarity found to have 70% similarity were considered repost messages. All repost sentiment scores underwent either a 1% or a 3% reduction. As stated in Table 11, the p-values of 0.821, 0.257, and 0.246 are greater than 0.05, implying that there is no significant difference between new scores and original scores under the controlled environment, under the uncontrolled environment, and after application of the recurrence percentage reduction scores.

Table 12: Confusion Matrix of Controlled Environment

Actual \ Predicted | Negative (F) | Positive (T)
Negative (F) | 11 | 1
Positive (T) | 7 | 6

Table 13: Confusion Matrix of Uncontrolled Environment

Actual \ Predicted | Negative (F) | Positive (T)
Negative (F) | 31 | 3
Positive (T) | 9 | 11

Table 12 and Table 13 show the number of positive and negative instances classified correctly or incorrectly. Out of 79 instances, 59 instances were classified correctly and 20 instances were classified incorrectly.

Table 14: Sentiment Analysis Metric Summary

Both negative sentiments resulted in high precision of at least 91%, and the lowest precision value is 46%, under the positive sentiment of the Controlled Environment Domain. The highest recall value is 86%, under the positive sentiment of the Controlled Environment Domain, while the rest of the recall values range from 61% up to 79%. Lastly, the highest f-score value is 84%, under the negative sentiment of the Uncontrolled Environment Domain, whereas the remaining values are within 60% up to 84%.

The improved Ilao's hybrid model provided accuracies of 68% (Controlled Environment Domain), 79% (Uncontrolled Environment Domain), and an average of 74% across the collected Jejemon format domains.

Generally, the average scores of precision, recall, f-score, and accuracy are 70% and above in terms of classification performance across the different Jejemon format domains.

7. CONCLUSION

This paper experimented on sentiment analysis focused solely on the Filipino fascination with expressing ideas through Jejemon language.

However, Jejemon language raised issues about millennial English proficiency. Jejemon language as a form of communication is normally expressed through short text and special representations of words or expressions. Furthermore, Jejemon expression offers several language structures ranging from simple to complicated techniques.

In general, the study of machine translation from Jejemon expression into its counterpart language, namely English, Filipino, or a combination, yielded successful conversion based on the cosine similarity results. The similarity rates of translated expressions against actual expressions were 81% and 76% for the controlled and uncontrolled environments respectively.

Even though some instances were not converted exactly when compared to the actual expression, the paired T-Test shows no significant difference between the original polarity score and the new polarity score derived from the two dataset domains. Furthermore, repost messages that underwent reduction through the recurrence percentage reduction technique do not show a significant difference across the two domains based on the available data.

Moreover, sentiment classification through the hybrid approach resulted positively in terms of consistency and accuracy, based on the average precision, recall, and accuracy of 71%, 76%, and 73% respectively. The average F-score of 71% signifies acceptable classification performance, since there is an uneven representation of sentiment instances in the collected domains.

ACKNOWLEDGEMENT

The authors would like to express their greatest gratitude to the Graduate School of the Technological Institute of the Philippines, Quezon City, for its support in the realization of this paper's publication.

REFERENCES

1. Ramos, J. J. R. (2018). Bekimon in Social Networking Site: Modernized Idiolect and Sociolect. International Journal of Social Sciences & Humanities, 3(2), 16-25.
2. Nocon, N., Cuevas, G., Magat, D., Suministrado, P., & Cheng, C. (2014, October). NormAPI: An API for normalizing Filipino shortcut texts. In 2014 International Conference on Asian Language Processing (IALP) (pp. 207-210). IEEE. https://doi.org/10.1109/IALP.2014.6973494
3. Mongaya, K.M. (2010). (http://karlomongaya.wordpress.com/2010/07/09/the-internet-as-corrective-of-individualist-culture/)
4. The Social and Educational Influences of Jejemon Texting Style. (2010). Date Retrieved: November 10, 2019.
5. Tubac, Angelo. (2017). Linguistic Innovations in the Jejemon Phenomenon. 10.13140/RG.2.2.18434.07360.
6. Ilao, A. & Fajardo, A. Sentiment Analysis of Tweet Messages using Hybrid Approach Algorithm. In 2019, The 17th International Conference on ICT and Knowledge Engineering (ICT-KE). IEEE. https://doi.org/10.1109/ICTKE47035.2019.8966887
7. Cataan, J. C. (2011). A ‘World’ Within the World: The Jejemons as the ‘Other’ Culture. Unpublished Undergraduate Thesis, University of the Philippines College of Mass Communication.
8. Raghunathan, K., & Krawczyk, S. (2009). CS224N: Investigating SMS Text Normalization using Statistical Machine Translation. Department of Computer Science, Stanford University.
9. Devika, M. D., Sunitha, C., & Ganesh, A. (2016). Sentiment Analysis: A Comparative Study on Different Approaches. Procedia Computer Science, 87, 44-49. https://doi.org/10.1016/j.procs.2016.05.124
10. Romanyshyn, M. (2013). Rule-Based Sentiment Analysis of Ukrainian Reviews. International Journal of Artificial Intelligence & Applications, 4(4), 103.
11. Kawathekar, S. A., & Kshirsagar, M. M. (2012). Sentiments Analysis Using Hybrid Approach Involving Rule-Based & Support Vector Machines Methods. IOSRJEN, 2(1), 55-58. https://doi.org/10.9790/3021-0215558
12. Almatarneh, S., & Gamallo, P. (2018). A Lexicon Based Method to Search for Extreme Opinions. PLoS ONE, 13(5): e0197816. https://doi.org/10.1371/journal.pone.0197816
13. Cho, H., Lee, J. S., & Kim, S. (2013). Enhancing Lexicon-Based Review Classification by Merging and Revising Sentiment Dictionaries. In Proceedings of the Sixth International Joint Conference on Natural Language Processing (pp. 463-470).
14. Lu, W., Du, X., Hadjieleftheriou, M., & Ooi, B. C. (2014). Efficiently Supporting Edit Distance Based String Similarity Search Using B+-Trees. IEEE Transactions on Knowledge and Data Engineering, 26(12), 2983-2996.
15. Khuat, Tung & Duc Hung, Nguyen & Thi My Hanh, Le. (2015). A Comparison of Algorithms used to Measure the Similarity between Two Documents. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 4, 1117-1121.
16. Chen, H. (2012). String Metric and Word Similarity Applied to Information Retrieval (Doctoral Dissertation, Master's Thesis, School of Computing, University of Eastern Finland).
17. Prasanna Lakshmi, K., Shraddha, V., Abhinava, V., Kavya, K., & Gayathri, R. (2017). Sentiment Analysis and Prediction Using Text Mining. Indian Journal of Science and Technology, 10(28). doi: 10.17485/ijst/2017/v10i28/113441
18. Ilao, A. & Fajardo, A. (2019). Sentiment Analysis of Tweet Messages using Hybrid Approach Algorithm. 17th International IEEE Conference of ICT and Knowledge Engineering 2019. https://doi.org/10.1109/ICTKE47035.2019.8966887
19. https://www.internetslang.com/trending.asp. Accessed Date: March 10, 2020.
20. https://abiword.github.io/enchant/. Accessed Date: March 12, 2020.
21. https://github.com/raymelon/tagalog-dictionary-scraper/blob/master/tagalog_dict.txt. Accessed Date: March 12, 2020.
22. Domingo and R. Roxas. (2006). Utilizing Clues in Syntactic Relationships for Automatic Target Word Sense Disambiguation. Journal of Research for Science, Computing and Engineering, 3(3), 18-24. https://doi.org/10.3860/jrsce.v3i3.99
23. Alcantara and A. Borra. (2008). Constituent Structure for Filipino: Induction through Probabilistic Approaches. Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation (PACLIC), 113-122.

1751

