Semantics of Data Mining Services in Cloud Computing


The recent incorporation of Data Mining and Machine Learning services into the offerings of Cloud Computing providers gives users comprehensive data analysis tools together with the advantages of the cloud environment. Providers of Cloud Computing services for Data Mining publish their service descriptions and definitions in many formats, which are often incompatible with those of other providers. From a functional point of view, the ability to describe complete Data Mining services is fundamental to maintaining the usability and, especially, the portability of these services, independently of the underlying software/hardware support and of the differences between cloud platforms. The main objective of this paper is to design a Data Mining service definition that composes a complete service from a single, simple definition, so that a data mining workflow can be ported to and deployed on different providers, or even offered in a marketplace of such ready-to-consume services. This article presents a semantic scheme for the definition and description of complete Data Mining services, covering both the management of the service by the provider (price, authentication, Service Level Agreement, …) and the definition of the Data Mining workflow as a service. It represents a contribution toward the standardization and industrialization of Data Mining services. To assess the validity of the scheme, a list of services from Data Mining providers has been described, and a full service for a Random Forest algorithm has been defined as an example. In addition, a practical scenario has been developed: a deployment platform for Data Mining services that gives functional support to the scheme and illustrates the practical benefits of the proposal for the end user.
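
As a rough illustration of what "a single definition for a complete service" could look like, the following Java sketch groups provider-side management terms (price, authentication, Service Level Agreement) with an ordered workflow. It is only a sketch under stated assumptions: the class names, field names, workflow step labels and numeric values are hypothetical placeholders and are not taken from the scheme described above.

```java
import java.util.List;

// Hypothetical illustration only: names and values are assumptions,
// not the vocabulary of the semantic scheme described in the abstract.
final class ServiceDefinitionSketch {

    // Provider-side management terms for a service.
    static final class Management {
        final double pricePerHour;     // placeholder pricing unit
        final String authentication;   // placeholder label, e.g. "api-key"
        final double slaAvailability;  // promised availability in [0, 1]

        Management(double pricePerHour, String authentication, double slaAvailability) {
            this.pricePerHour = pricePerHour;
            this.authentication = authentication;
            this.slaAvailability = slaAvailability;
        }
    }

    // A complete service: management terms plus an ordered workflow.
    static final class Service {
        final String name;
        final Management management;
        final List<String> workflowSteps;

        Service(String name, Management management, List<String> workflowSteps) {
            this.name = name;
            this.management = management;
            this.workflowSteps = List.copyOf(workflowSteps); // immutable copy
        }

        // Render a provider-neutral, one-line description of the service.
        String describe() {
            return name + " [auth=" + management.authentication
                    + ", steps=" + String.join(" -> ", workflowSteps) + "]";
        }
    }

    // Builds an example Random Forest service with placeholder values.
    static Service randomForestExample() {
        Management management = new Management(0.10, "api-key", 0.999);
        return new Service("RandomForest", management,
                List.of("load-data", "train-model", "evaluate"));
    }

    public static void main(String[] args) {
        System.out.println(randomForestExample().describe());
    }
}
```

Keeping the management terms and the workflow in one object mirrors the idea that a single definition can be handed to different providers; a real implementation would more likely serialize such a definition in a semantic format than hard-code it in Java.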

Branch: CSE

Domain: Cloud Computing

Developed In: Java