Social Robot for the Depressed and Lonely

Utilizing Emotional Analysis and LLM Technology

Posted by C, Li on October 20, 2023

Dataset

Multimodal datasets collected specifically from groups such as lonely people are hard to find, and because of the ethical implications, researchers should limit or avoid collecting such data in order to protect these populations.

Fortunately, emotion is an inherent and universal form of human expression. It goes beyond language and action, and it appears in similar forms across all kinds of groups. We therefore expect multimodal datasets of ordinary people to be similarly effective for emotion analysis.

In the end, we found two datasets. The first is the AVEC dataset on patients with depression, which is better suited to our topic. Our team has submitted an application for it but has not yet obtained access. If it is not available, we will use the MOSI dataset provided by CMU, a multimodal dataset consisting of videos of ordinary people.

Encoder Structure

For the encoder structure, we will use BERT alone to extract features from text input, an LSTM architecture for audio, and, for now, VGG16 (a CNN architecture) to extract visual features.

These choices are provisional. We will report concrete results in the next blog post, because we still want to compare against more recent models such as ViT, consider replacing parts of the architecture, and decide how to fuse the features, choosing among MulT, contrastive learning, and attention mechanisms. This is also why, over the past two weeks, we have only selected datasets and encoders rather than analyzing results directly.
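Because the encoder choices are still provisional, it may help to keep them in one small registry so that a single modality can be swapped without touching the rest. The sketch below is only an illustration: `EncoderSpec`, `ENCODERS`, `swap_encoder`, and `concat_fusion_dim` are hypothetical names, the LSTM hidden size of 128 is an assumed placeholder, and plain concatenation is used only as the simplest fusion baseline, not as the fusion method we will end up choosing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderSpec:
    """Describes which model encodes a modality and how wide its features are."""
    modality: str
    model: str
    feature_dim: int


# Provisional encoder choices from this post. The feature sizes are assumptions:
# 768 is the hidden size of BERT-base, 4096 is the width of VGG16's penultimate
# fully connected layer, and 128 is a placeholder LSTM hidden size.
ENCODERS = {
    "text": EncoderSpec("text", "BERT", 768),
    "audio": EncoderSpec("audio", "LSTM", 128),
    "visual": EncoderSpec("visual", "VGG16", 4096),
}


def swap_encoder(encoders, modality, model, feature_dim):
    """Return a new registry with one modality's encoder replaced."""
    if modality not in encoders:
        raise KeyError(f"unknown modality: {modality}")
    updated = dict(encoders)
    updated[modality] = EncoderSpec(modality, model, feature_dim)
    return updated


def concat_fusion_dim(encoders):
    """Width of the fused vector if per-modality features are concatenated."""
    return sum(spec.feature_dim for spec in encoders.values())


if __name__ == "__main__":
    print(concat_fusion_dim(ENCODERS))  # 4992
    # Trying ViT (base size, hidden width 768) in place of VGG16.
    with_vit = swap_encoder(ENCODERS, "visual", "ViT", 768)
    print(concat_fusion_dim(with_vit))  # 1664
```

Keeping the registry immutable and returning a new copy from `swap_encoder` makes it easy to compare several candidate configurations side by side.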

Generating Pythonic Code

In general, ChatGPT responds to a user's input with output in the form of a message, in the following format.

Figure 1: Example

To obtain output in a more precisely fixed form, ChatGPT needs to generate ‘Pythonic code’ rather than the message form above. We first make the output always take the form of Pythonic code, and then gradually refine that code into the desired form through prompt engineering. Generating the same content as the message above in the form of Pythonic code gives the following result.

Figure 2: New

Now we need to define a code structure optimized for the desired operation of the social robot, and then use prompt engineering to make ChatGPT generate output that follows this predetermined code structure.
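One benefit of a fixed code structure is that the robot can check the model's output before acting on it. The following is a minimal sketch under stated assumptions, not our final design: it assumes the output is a single call of the form `robot.<action>(...)` with literal arguments, and the prompt text, the action names in `ALLOWED_ACTIONS`, and the function `parse_robot_call` are all hypothetical. It uses Python's standard `ast` module to read the call without executing it.

```python
import ast

# Hypothetical system prompt that constrains the model to one call per reply.
SYSTEM_PROMPT = (
    "Reply with exactly one line of Python of the form "
    "robot.<action>(...), using only literal arguments."
)

# Hypothetical set of actions the robot is allowed to perform.
ALLOWED_ACTIONS = {"say", "play_music", "set_expression"}


def parse_robot_call(code):
    """Parse one call such as robot.say("hi", emotion="calm") without running it.

    Returns (action, positional_args, keyword_args).
    Raises SyntaxError if the text is not a single Python expression, and
    ValueError if it does not match the expected structure.
    """
    tree = ast.parse(code.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a single function call")

    func = call.func
    is_robot_method = (
        isinstance(func, ast.Attribute)
        and isinstance(func.value, ast.Name)
        and func.value.id == "robot"
    )
    if not is_robot_method:
        raise ValueError("expected a call of the form robot.<action>(...)")
    if func.attr not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {func.attr}")

    # literal_eval accepts only literals (strings, numbers, lists, ...),
    # so nested calls or variable names in the arguments are rejected.
    args = [ast.literal_eval(arg) for arg in call.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return func.attr, args, kwargs


if __name__ == "__main__":
    reply = 'robot.say("I am here with you.", emotion="calm")'
    print(parse_robot_call(reply))
```

Parsing instead of calling `eval` or `exec` means that an unexpected or malformed reply raises an error rather than running arbitrary code on the robot.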