CovidTrends: Identifying Behaviors during the COVID-19 Pandemic – accepted paper

April 22, 2021 § Leave a comment

Paper accepted for presentation, discussion and publication in the Main Track of SBSI 2021: “CovidTrends: Identifying Behaviors during the COVID-19 Pandemic”, by Marcelo Loutfi, Marcelo Tibau, Sean Siqueira and Bernardo Nunes. In this paper, we identify behavioral changes in people during the COVID-19 pandemic by analyzing terms searched on Google and their relationship with the news. Three behaviors stood out: (i) an initial concern about supply shortages; (ii) people's perception of the new reality; and (iii) the belief in a silver bullet.

Ethical algorithm

April 8, 2021 § Leave a comment

Describing a decision-making system as an algorithm is often a way of deflecting responsibility for human decisions.

For many, the term implies a set of rules based objectively on data or empirical evidence. It also suggests a highly complex system, so complex that a person would struggle to understand its inner workings or anticipate its behavior once deployed. Well, the reality is that this definition is not all that accurate.

Read the full text at Update or Die. Published on April 8, 2021.

Notes on studies in AI and cognition

March 18, 2021 § Leave a comment

Understanding the human mind better in order to build more robust AIs.

Despite the tremendous progress of artificial intelligence in areas such as machine translation, object classification and speech recognition, most AI systems still have an extremely narrow focus today. Developers in the field (myself included) have been turning to studies in the cognitive sciences (such as psychology, linguistics and philosophy) to better understand how the human mind works and to apply this knowledge to building more robust AIs.

Read the full text at Update or Die. Published on March 18, 2021.

Artificial Intelligence: surfing the uncertainties

March 1, 2021 § Leave a comment

“Surfing the uncertainties” is usually achieved through predictive models. In AI, they try to imitate the natural predictive process of human beings.

Read the full text at Update or Die. Published on February 27, 2021.

Knowledge-Based Society

February 19, 2021 § Leave a comment

To have a knowledge-based economy and generate innovation, it is necessary to foster education, knowledge applied to intellectual property, and multiculturalism.

Read the full text at Update or Die. Published on February 12, 2021.

Think-Aloud Exploratory Search: Understanding Search Behaviors and Knowledge Flows

February 12, 2021 § Leave a comment

Tibau M., Siqueira S.W.M., Nunes B.P. (2021) Think-Aloud Exploratory Search: Understanding Search Behaviors and Knowledge Flows. In: Visvizi A., Lytras M.D., Aljohani N.R. (eds) Research and Innovation Forum 2020. RIIFORUM 2020. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-030-62066-0_23

Abstract: This paper describes an experiment that uses Concurrent Think-Aloud protocol (CTA) and person-to-person interviews to map searching behaviors and knowledge flows during search sessions. The findings are: (1) the most used searching strategy during exploratory searches was the “Metacognitive Domain”; and (2) online searching experts have a fair ability to deal with ideas prompted by browsing the search results. The main contributions of this research lie in the understanding of the process in which people find, access, decide what content is useful and apply online data to their different information needs.

Check out more at https://link.springer.com/chapter/10.1007/978-3-030-62066-0_23

Semantic Data Structures for Knowledge Generation in Open World Information System

November 11, 2020 § Leave a comment

ABSTRACT: As the amount of information grows exponentially online, Information Systems role in support knowledge flows encouraged by linked data increases as a driver to innovation, culture, business practices and people behavior. Web search engines are particularly affected by the open world challenges, notably as part of the growing digital ecosystems of networks and platforms of technology, media, and telecommunications (TMT) companies delivering personalized and customized services (e.g. Amazon in retailing, Uber in ride service hailing, food delivery, and bicycle-sharing system, and Airbnb in lodging). To recognize search intent drawn from user’s behavior allows to provide personalized search results. The work presented in this paper has the purpose of exploring methods to represent semantic relationships between concepts indexed by Web search engines in order to aid them recognize search intent and display results that meet the search intent. The performance of two different types of data structures based on entity-centric indexing was compared. The data structures were: a knowledge base that used an entity-centric mapping of Wikipedia categories and the KBpedia Knowledge Graph. Through analysis of entity ranking and linking, we detected that the Knowledge Graph could identify approximately three times more properties and relationships, which increases Web search engines capability to “understand” what is being asked.

Publication: SBSI’20: XVI Brazilian Symposium on Information Systems – November 2020 Article No.: 13 Pages 1–7 https://doi.org/10.1145/3411564.3411611

Covid-19 Predictions to Brazil

March 16, 2020 § 2 Comments

Figure 1: Plot from July 27 (daily death toll).

Figure 2: Plot from July 27 (daily spreading).

Figure 3: Plot from March 30 (daily spreading).

NEW UPDATE

Last update: August 3 – 12:30 p.m.

The plots above use data from the World Health Organization (WHO), organized by Our World in Data (https://ourworldindata.org/coronavirus-source-data). From March 18th to June 7th I updated the data based on information provided by the Brazilian Ministry of Health. Since June 8th my source has been the media consortium (O Estado de S. Paulo, Extra, Folha, O Globo, G1 and UOL). The plots come from a predictive model I devised based on Taylor series (https://en.wikipedia.org/wiki/Taylor_series#Calculation_of_Taylor_series), using the first and second derivatives of a continuous approximation of the data.
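As a rough illustration of the idea, here is a minimal sketch of a second-order Taylor extrapolation. It assumes daily cumulative counts and estimates the derivatives with backward finite differences over the last three observations. The function name `taylor_forecast` and the sample numbers are hypothetical; this is not the actual model or data behind the plots.

```python
def taylor_forecast(series, days_ahead):
    """Extrapolate a daily series with a second-order Taylor expansion.

    f(t + h) ~ f(t) + f'(t) * h + f''(t) * h**2 / 2

    The derivatives are approximated with backward finite differences
    over the last three observations (an assumption of this sketch).
    """
    if len(series) < 3:
        raise ValueError("need at least three observations")
    f0, f1, f2 = series[-1], series[-2], series[-3]
    first = f0 - f1              # f'(t): latest daily change
    second = f0 - 2 * f1 + f2    # f''(t): change in the daily change
    h = days_ahead
    return f0 + first * h + second * h ** 2 / 2


# Made-up cumulative counts, for illustration only.
cases = [100, 130, 170]
print(taylor_forecast(cases, 1))  # one day ahead
print(taylor_forecast(cases, 2))  # two days ahead
```

Because the second-derivative term grows with the square of the horizon, a sketch like this is very sensitive to noise in the last few data points, which may help explain why forecasts made a few days apart can differ widely.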

Since April 15, a forecast for the number of deaths has also been available; Figure 1 shows this plot. More important, though, is the growth rate, particularly how fast the number of deaths is doubling. By July 31 the number of deaths will be around 92,438. If the forecast pattern continues as it is now, the doubling time will be around 44 days.
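For readers who want to reproduce a doubling-time figure, the sketch below shows one common way to compute it from two observations, assuming constant exponential growth between them. The helper `doubling_time` and the sample counts are hypothetical and are not taken from the model described in this post.

```python
import math


def doubling_time(count_start, count_end, days_elapsed):
    """Days needed for a count to double, assuming constant exponential growth.

    T = days_elapsed * ln(2) / ln(count_end / count_start)
    """
    if count_start <= 0 or count_end <= count_start or days_elapsed <= 0:
        raise ValueError("counts must be positive and growing, days_elapsed > 0")
    return days_elapsed * math.log(2) / math.log(count_end / count_start)


# Made-up example: a count that quadruples in 10 days doubles about every 5 days.
print(round(doubling_time(1_000, 4_000, 10), 1))
```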

Figure 2 shows the daily spreading forecast based on the last 24 hours. Figure 3 represents data updated on March 30th and contains the number of cases released at that point; I decided to keep it for comparison purposes only. It shows an S-shape (sigmoid growth curve), which I hope we will soon see more often (and it actually appeared at that moment). A sigmoid curve represents an acceleration phase (meaning exponential growth) followed by a change to linear growth (which would mean stability in the number of cases). As we can see, the curve in Figure 2 does not show a J-shape, in which the rate of change (the derivative) of the number of cases with respect to time is proportional to the number of cases itself; instead it shows an S-shape, which could indicate a flattening of the curve.

In plain mathematics, it is still growing as an exponential function.
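For reference, the J-shaped and S-shaped curves discussed above correspond to the standard exponential and logistic growth models. The symbols used here are assumed notation, not taken from the original model: N(t) is the number of cases at time t, N_0 the initial number of cases, r the growth rate, and K the level at which a sigmoid curve plateaus.

```latex
% J-shape: the rate of change is proportional to the number of cases
\[
\frac{dN}{dt} = rN
\quad\Longrightarrow\quad
N(t) = N_0 \, e^{rt}
\]

% S-shape: logistic (sigmoid) growth that slows down as N approaches K
\[
\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right)
\quad\Longrightarrow\quad
N(t) = \frac{K}{1 + \frac{K - N_0}{N_0} \, e^{-rt}}
\]
```

While N is much smaller than K the two models are nearly indistinguishable, which is why an early S-shaped curve can still look exponential.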

If the pattern continues as it is now, I’m forecasting about 2,698,241 cases by July 31, with a mortality rate (total deaths/total cases) of around 3.42%. The authorities had been working with May/June as the range for the coronavirus peak in Brazil, with no particular date because of an underreporting problem that hinders an overview. As of late July, no new peak forecast had been made. I created a repo on GitHub with the whole CSV file (the last forecast data is for July 31).

P.S.: After July 31 I discontinued this model, so there will be no new updates to this post. I will, however, keep updating the number of cases and deaths daily in the GitHub repo. Below is a list of the previous predictions.

Previous Predictions

  • Updated on July 27 and comprises data up to July 22: 2,698,241 cases (the actual number was 2,666,298), with 92,438 deaths (the actual number was 92,568) on July 31.
  • Updated on July 23 and comprises data up to July 23: 2,424,331 cases, with 89,867 deaths on July 31.
  • Updated on July 14 and comprises data up to July 13: 1,976,451 cases (the actual number was 1,970,909), with 74,997 deaths (the actual number was 75,523) on July 15.
  • Updated on July 9 and comprises data up to July 8: 2,023,112 cases, with 79,436 deaths on July 15.
  • Updated on July 3 and comprises data up to July 2: 3,372,905 cases, with 107,837 deaths on July 15.
  • Updated on July 1st and comprises data up to June, 30th: 3,847,857 cases, with 149,084 deaths on July 15.
  • Updated on June 29th and comprises data up to June, 28th: 1,441,347 cases (the actual number was 1,408,485), with 60,550 deaths (the actual number was 59,656) on June, 30th.
  • Updated on June 23rd and comprises data up to June, 22nd: 1,656,690 cases, with 65,888 deaths on June, 30th.
  • Updated on June 19th and comprises data up to June, 18th: 1,773,163 cases, with 97,658 deaths on June, 30th.
  • Updated on June 15th and comprises data up to June, 14th: 1,526,412 cases, with 131,324 deaths on June, 30th.
  • Updated on June 8th and comprises data up to June, 7th: 2,594,616 cases, with 124,373 deaths on June, 30th.
  • Updated on June 2nd and comprises data up to June, 1st: 2,053,003 cases, with 80,983 deaths on June, 30th.
  • Updated on May 28th and comprises data up to May 27th: 533,236 cases (the actual number was 514,849), with 31,311 deaths (the actual number was 29,314) on May, 31st.
  • Updated on May 26th and comprises data up to May 25th: 454,764 cases on May, 31st.
  • Updated on May 25th and comprises data up to May 24th: 470,562 cases on May, 31st.
  • Updated on May 22nd and comprises data up to May 21st: 470,562 cases, with 55,177 deaths on May, 31st.
  • Updated on May 20th and comprises data up to May 19th: 901,663 cases, with 52,403 deaths on May, 31st.
  • Updated on May 18th and comprises data up to May 17th: 901,663 cases, with 82,480 deaths on May, 31st.
  • Updated on May 11th and comprises data up to May 10th: 646,436 cases, with 48,399 deaths on May, 31st.
  • Updated on May 8th and comprises data up to May 7th: 563,524 cases, with 26,914 deaths on May, 31st.
  • Updated on May 7th and comprises data up to May 6th: 413,014 cases, with 120,915 deaths on May, 31st.
  • Updated on May 5th and comprises data up to May 5th: 2,619,918 cases, with 175,398 deaths on May, 31st.
  • Updated on May 4th and comprises data up to May 4th: 110,501 cases (the actual number was 114,715), with 7,565 deaths (the actual number was 7,921) on May, 5th.
  • Updated on May 1st and comprises data up to May 1st: 136,501 cases, with 7,972 deaths on May, 5th.
  • Updated on April 30th and comprises data up to April 30th: 135,495 cases, with 7,960 deaths on May, 5th.
  • Updated on April 29th and comprises data up to April 29th: 135,570 cases, with 14,158 deaths on May, 5th.
  • Updated on April 28th and comprises data up to April 28th: 125,474 cases, with 11,787 deaths on May, 5th.
  • Updated on April 27th and comprises data up to April 27th: 173,173 cases, with 14,640 deaths on May, 5th.
  • Updated on April 25th and comprises data up to April 25th: 247,286 cases, with 7,781 deaths on May, 5th.
  • Updated on April 24th and comprises data up to April 24th: 94,249 cases, with 8,157 deaths on May, 5th.
  • Updated on April 22nd and comprises data up to April 22nd: 148,040 cases, with 11,291 deaths on May, 5th.
  • Updated on April 20th and comprises data up to April 19th: 41,217 cases (the actual number was 40,581), with 2,658 deaths (the actual number was 2,575) on April, 20th.
  • Updated on April 17th and comprises data up to April 17th: 52,837 cases (with 3,009 deaths) on April, 20th.
  • Updated on April 15th and comprises data up to April 15th: 63,167 cases (with 2,931 deaths) on April, 20th.
  • Updated on April 14th and comprises data up to April 14th: 51,319 cases on April, 20th.
  • Updated on April 13th and comprises data up to April 13th: 37,947 cases on April, 20th.
  • Updated on April 11th and comprises data up to April 11th: 48,608 cases on April, 20th.
  • Updated on April 10th and comprises data up to April 10th: 50,880 cases on April, 20th.
  • Updated on April 9th and comprises data up to April 9th: 40,279 cases on April, 20th.
  • Updated on April 8th and comprises data up to April 8th: 78,107 cases on April, 20th.
  • Updated on April 6th and comprises data up to April 6th: 39,459 cases on April, 20th.
  • Updated on April 4th and comprises data up to April 4th: 11,572 cases on April, 5th (the actual number was 11,281) and 39,864 on April, 20th.
  • Updated on April 3rd and comprises data up to April 3rd: 11,608 cases on April, 5th and 41,958 on April, 20th.
  • Updated on April 2nd and comprises data up to April 2nd: 11,008 cases on April, 5th and 23,700 on April, 20th.
  • Updated on April 1st and comprises data up to April 1st: 11,600 cases on April, 5th and 35,522 on April, 20th.
  • Updated on March 31st and comprises data up to March, 31st: 25,654 cases on April, 5th and 228,893 on April, 20th.
  • Updated on March 30th and comprises data up to March, 30th: 6,352 cases on April, 5th and 9,542 on April, 20th.
  • Updated on March 27th and comprises data up to March, 27th: 12,670 cases on April, 5th and 47,113 on April, 20th.
  • Updated on March 25th and comprises data up to March, 25th: 2,843 cases on March, 26th (the actual number was 2,915) and 6,672 on April, 5th.
  • Updated on March 25th and comprises data up to March, 24th: 3,113 cases on March, 26th and 14,331 on April, 5th.
  • Updated on March 23rd and comprises data up to March, 23rd: 3,999 cases on March, 26th and 24,311 on April, 5th.
  • Updated on March 22nd and comprises data up to March, 22nd: 5,622 cases on March, 26th and 32,784 on April, 5th.
  • Updated on March 21st and comprises data up to March, 20th: 4,929 cases on March, 26th and 20,558 on April, 5th.
  • Updated on March, 20th and comprises data up to March, 19th: 2,255 cases on March, 26th and 5,601 on April, 5th.
  • Dataset downloaded on March, 18th and comprises data up to March, 17th: 2,690 cases on March, 26th.
  • Dataset downloaded on March, 16th and comprises data up to March, 16th: 4,554 cases on March, 26th.
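To put the entries above in perspective, a simple way to compare a forecast with the actual figure is the signed percentage error. The sketch below applies it to the July 31 numbers quoted in the first entry of the list; the helper name `percent_error` is hypothetical.

```python
def percent_error(predicted, actual):
    """Signed forecast error as a percentage of the actual value."""
    return (predicted - actual) / actual * 100


# July 31 forecast versus the actual figures quoted in the list above.
cases_err = percent_error(2_698_241, 2_666_298)
deaths_err = percent_error(92_438, 92_568)
print(f"cases: {cases_err:+.2f}%, deaths: {deaths_err:+.2f}%")
```

A positive value means the forecast was above the actual number, a negative value means it was below.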

Reordering Search Results to Support Learning

February 17, 2020 § Leave a comment

Teixeira C.P., Tibau M., Siqueira S.W.M., Nunes B.P. (2020) Reordering Search Results to Support Learning. In: Popescu E., Hao T., Hsu TC., Xie H., Temperini M., Chen W. (eds) Emerging Technologies for Education. SETE 2019. Lecture Notes in Computer Science, vol 11984. Springer, Cham.

Abstract: Although many learning activities involve search engines, their ranking criteria are focused on providing factual rather than procedural information. In the context of Searching as Learning, providing factual information may not be the best approach. In this paper, we discuss the relevance criteria according to traditional learning theories to support search engine results reordering based on content suitability to learning purposes. We proceeded on the investigation by selecting some self-proclaimed search literacy experts to answer thoroughly questions about their views on the reordered results. We take into account that literacy expert’s judgment may reveal issues regarded to technical side on learning supported by search tools. Experienced users claimed a preference for reliable sources and direct answers to what they are looking for, as they have exploratory skills to overcome information incompleteness.

DOI: https://doi.org/10.1007/978-3-030-38778-5_39

A comparison between Entity-Centric Knowledge Base and Knowledge Graph to Represent Semantic Relationships for Searching as Learning Situations

November 25, 2019 § Leave a comment

TIBAU, Marcelo; SIQUEIRA, Sean; NUNES, Bernardo Pereira. A comparison between Entity-Centric Knowledge Base and Knowledge Graph to Represent Semantic Relationships for Searching as Learning Situations. Anais dos Workshops do Congresso Brasileiro de Informática na Educação, [S.l.], p. 823, Nov. 2019. ISSN 2316-8889. Available at: <https://br-ie.org/pub/index.php/wcbie/article/view/9032>. doi: http://dx.doi.org/10.5753/cbie.wcbie.2019.823.

Abstract: Searching the web with learning intent, known as Searching as Learning (SaL), consists on learners to use Web search engines as a technology to drive their learning process. However, it may be difficult to users to find out relevant information online due to an inability to accurately specify their information need, a situation known as Anomalous State of Knowledge (ASK). To minimize the ASK situation, the continuous flow of data gathering and interaction between user and the search results could be used by search engines to tailor learning-intent search experience. It requires Web search engines to identify such intent and they may use linked data, Knowledge Bases and Graph Databases in order to recognize the meaning of query terms and keywords and use them to predict learning intent. In order to explore the possibility of semantic data structures to represent knowledge that could aid a learning-driven Web search engine to recognize learning intention from user’s queries, the present paper compared the performance of two different types of data structures based on entity-centric indexing to identify properties and semantic relationships. One was a knowledge base that used a entity-centric mapping of Wikipedia categories and the other was the KBpedia Knowledge Graph. The entity ranking and linking of both were analyzed and we discovered that the knowledge graph could identify about three times more properties and relationships.