Abstract:
As the number of users on social media platforms such as Twitter, Facebook, and Instagram increases, so does the number of bot or spam accounts on these platforms. Typically, these bot or spam accounts are automated programmatically through the social media site's API and attempt to convey or spread a particular message. Some bots are designed for marketers trying to sell products or attract users to new sites. Other types of bots are much more malicious and disseminate misinformation that harms or tricks users. Such bots (fake accounts) may lead to serious consequences, as people's social networks have become one of the determining factors in their general decision making. These accounts therefore have the potential to drastically influence people's opinions and, in turn, real-life events. Researchers have now begun to investigate ways to detect these types of malicious accounts automatically using different machine learning techniques. To successfully differentiate between real accounts and bot accounts, a comprehensive analysis of the behavioral patterns of both types of accounts is required. In this paper, we investigate ways to select the best features from a data set for automated classification of different types of social media accounts (e.g., bot versus real account) via visualization. To help select better feature combinations, we use self-organizing maps to visualize which features may be more effective for classification.
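As a rough illustration of the idea of inspecting features through a self-organizing map, the following is a minimal sketch and not the implementation used in this paper. It trains a small map with NumPy and extracts one "component plane" per feature, which can then be plotted as a heat map. The function names `train_som` and `component_planes`, the grid size, the learning-rate and neighborhood schedules, and the synthetic two-feature data set (one class-dependent feature, one pure-noise feature) are all assumptions made for the example.

```python
import numpy as np


def train_som(data, grid=(5, 5), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small self-organizing map; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every node, shape (rows, cols, 2).
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # linearly shrinking neighborhood radius
            # Best matching unit: the node whose weight vector is closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighborhood around the BMU, measured on the grid.
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Pull each node toward x, weighted by its neighborhood value.
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def component_planes(weights):
    """Return one (rows, cols) plane per feature, shape (n_features, rows, cols)."""
    return np.moveaxis(weights, -1, 0)


# Synthetic, hypothetical data: feature 0 depends on the class label, feature 1 is noise.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=100)
informative = np.where(labels == 1, 0.9, 0.1) + rng.normal(0.0, 0.05, size=100)
noise = rng.random(100)
data = np.column_stack([informative, noise])

weights = train_som(data)
planes = component_planes(weights)
```

Each entry of `planes` can be drawn as a heat map (for example with matplotlib's `imshow`) and compared with where bot and real accounts land on the map. A feature whose plane varies in step with the class regions is a plausible candidate for the classifier, while a plane that shows no such relationship suggests a weaker feature.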