Documents Clustering Using K-Means Algorithm
Abstract
In the digital era, people can easily access a wide range of information through the Internet and store it as documents. As digital storage fills with large numbers of unstructured documents containing various types of information, people need an application that can help them organize and classify those documents automatically. Documents Clustering using K-Means Algorithm is a desktop-based document clustering application that implements the K-Means algorithm to group documents by the similarity of their content, with up to 85% accuracy relative to user expectations.
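As a rough illustration of clustering documents by content similarity with K-Means, here is a minimal Python sketch. It is not the application's actual implementation. The abstract does not state how documents are represented or compared, so the TF-IDF weighting, the Euclidean distance, the whitespace tokenizer, the seeding of centroids with the first k documents, and every function name (`tfidf_vectors`, `euclidean`, `kmeans`) are assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn raw strings into TF-IDF vectors over a shared, sorted vocabulary."""
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        vectors.append([
            (counts[term] / total) * math.log(n_docs / df[term])
            for term in vocab
        ])
    return vectors


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, k, max_iter=100):
    """Return one cluster label per vector."""
    # Assumption: seed centroids with the first k vectors so runs are repeatable.
    centroids = [list(v) for v in vectors[:k]]
    labels = [-1] * len(vectors)
    for _ in range(max_iter):
        # Assignment step: each document joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


docs = [
    "cat dog pet animal",
    "stock market trade price",
    "dog cat animal vet",
    "market price stock bank",
]
labels = kmeans(tfidf_vectors(docs), k=2)
print(labels)  # [0, 1, 0, 1]
```

The two documents about animals land in one cluster and the two about finance in the other. Seeding from the first k documents keeps the example deterministic; a real application would more likely choose the initial centroids at random or with a smarter seeding strategy, and results can vary with that choice.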
DOI: http://dx.doi.org/10.33021/itfs.v3i02.589
Copyright (c) 2019 IT for Society
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.