cclust                package:cclust                R Documentation

_C_o_n_v_e_x _C_l_u_s_t_e_r_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     The data given by `x' is clustered by the algorithm selected with
     `method': k-means, hard competitive learning, or neural gas.

     If `centers' is a matrix, its rows are taken as the initial
     cluster centers. If `centers' is an integer, `centers' rows of `x'
     are randomly chosen as initial values.

     The algorithm stops when no cluster center has changed during the
     last iteration or when the maximum number of iterations (given by
     `iter.max') is reached.

     If `verbose' is `TRUE', then for the `"kmeans"' method only, each
     iteration prints the iteration number and the number of cluster
     indices that have changed since the last iteration.

     If `dist' is `"euclidean"', the distance between the cluster
     center and the data points is the Euclidean distance (ordinary
     kmeans algorithm). If `"manhattan"', the distance between the
     cluster center and the data points is the sum of the absolute
     values of the coordinate differences.
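
     As a minimal base R sketch of the two distance measures described
     above, the distance from every row of a data matrix to one center
     could be computed as follows. The helper names `euclid_dist' and
     `manhattan_dist' are hypothetical and are not part of the package.

```r
# Hypothetical helpers, not part of cclust: distance from every row of
# the matrix x to a single center vector.
euclid_dist <- function(x, center) {
  # square root of the summed squared coordinate differences
  sqrt(rowSums(sweep(x, 2, center)^2))
}

manhattan_dist <- function(x, center) {
  # sum of the absolute coordinate differences
  rowSums(abs(sweep(x, 2, center)))
}

x <- rbind(c(0, 0), c(3, 4))
euclid_dist(x, c(0, 0))
manhattan_dist(x, c(0, 0))
```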

     If `method' is `"kmeans"', the kmeans clustering method is used,
     which works by repeatedly moving all cluster centers to the mean
     of their Voronoi sets. If `"hardcl"', the On-line Update (Hard
     Competitive learning) method is used, which performs an update
     directly after each input signal. If `"neuralgas"', the Neural
     Gas (Soft Competitive learning) method is used, which sorts, for
     each input signal, the units of the network according to the
     distance of their reference vectors to the input signal.
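
     The difference between the batch (kmeans) update and the on-line
     (hardcl) update can be illustrated with a rough base R sketch. It
     assumes Euclidean distance; the functions `kmeans_step' and
     `hardcl_step' and the argument `eps' are hypothetical names chosen
     for illustration and do not reproduce the package's code. Neural
     gas is not covered here.

```r
# Hypothetical sketch, not the package's implementation.

# One batch k-means step: assign each row of x to its nearest center
# (squared Euclidean distance), then move each center to the mean of
# the rows assigned to it (its Voronoi set).
kmeans_step <- function(x, centers) {
  k <- nrow(centers)
  d <- sapply(seq_len(k),
              function(j) rowSums(sweep(x, 2, centers[j, ])^2))
  d <- matrix(d, nrow = nrow(x))                 # keep n x k shape
  cluster <- max.col(-d, ties.method = "first")  # nearest center index
  for (j in seq_len(k)) {
    if (any(cluster == j)) {
      centers[j, ] <- colMeans(x[cluster == j, , drop = FALSE])
    }
  }
  list(centers = centers, cluster = cluster)
}

# One on-line step: only the winning (nearest) center moves toward the
# single input signal xi, scaled by the learning rate eps.
hardcl_step <- function(centers, xi, eps) {
  d <- rowSums(sweep(centers, 2, xi)^2)
  w <- which.min(d)
  centers[w, ] <- centers[w, ] + eps * (xi - centers[w, ])
  centers
}

x <- rbind(c(0, 0), c(0, 2), c(10, 0), c(10, 2))
start <- rbind(c(1, 1), c(9, 1))
res <- kmeans_step(x, start)
upd <- hardcl_step(start, c(0, 0), eps = 0.5)
```

     In the batch step every center is recomputed from all points
     assigned to it, whereas the on-line step changes a single center
     per input signal.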

     If `rate.method' is `"polynomial"', the polynomial learning rate
     is used, that is 1/t, where t stands for the number of input
     data for which a particular cluster has been the winner so far. If
     `"exponentially decaying"', the exponentially decaying learning
     rate is used according to par1*(par2/par1)^(iter/itermax), where
     par1 and par2 are the initial and final values of the learning
     rate.
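
     The two learning rate formulas can be evaluated directly in base
     R. The sketch below uses the hypothetical helper names `poly_rate'
     and `exp_rate' and takes 0.5 and 1e-5 as the initial and final
     values; it only illustrates the formulas and is not the package's
     own code.

```r
# Hypothetical helpers illustrating the two learning rate formulas.

# Polynomial rate: 1/t, with t the number of inputs won so far.
poly_rate <- function(t) 1 / t

# Exponentially decaying rate: moves from par1 (at iter = 0) to par2
# (at iter = itermax).
exp_rate <- function(iter, itermax, par1 = 0.5, par2 = 1e-5) {
  par1 * (par2 / par1)^(iter / itermax)
}

poly_rate(1:4)
exp_rate(c(0, 50, 100), itermax = 100)
```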

     The argument `rate.par' holds the parameters of the learning rate.
     If `rate.method' is `"polynomial"', the default is rate.par=1.0;
     otherwise it is rate.par=c(0.5,1e-5).

_U_s_a_g_e:

     cclust (x, centers, iter.max=100, verbose=FALSE, dist="euclidean",
             method= "kmeans", rate.method="polynomial", rate.par=NULL)

     print.cclust(cclust.obj)

_A_r_g_u_m_e_n_t_s:

       x: Data matrix where columns correspond to variables and rows to
          observations

 centers: Number of clusters or initial values for cluster centers

iter.max: Maximum number of iterations

 verbose: If `TRUE', make some output during learning

    dist: If `"euclidean"', the mean square error is used; if
          `"manhattan"', the mean absolute error is used

  method: If `"kmeans"', then we have the kmeans clustering method, if
          `"hardcl"' we have the On-line Update (Hard Competitive
          learning) method, and if `"neuralgas"', we have the Neural
          Gas (Soft Competitive learning) method. Abbreviations of the
          method names are accepted.

rate.method: If `"polynomial"', the polynomial learning rate is used;
          if `"exponentially decaying"', the exponentially decaying
          learning rate. It is used only for the Hardcl method.

rate.par: The parameters of the learning rate.

_V_a_l_u_e:

     `cclust' returns an object of class `"cclust"'. 

 centers: The final cluster centers.

initcenters: The initial cluster centers.

ncenters: The number of the centers.

 cluster: Vector containing the indices of the clusters to which the
          data points are assigned.

    size: The number of data points in each cluster.

    iter: The number of iterations performed.

 changes: The number of changes performed in each iteration step with
          the Kmeans algorithm.

    dist: The distance measure used.

  method: The algorithm method being used.

rate.method: The learning rate being used by the Hardcl clustering
          method.

rate.par: The parameters of the learning rate.

    call: The call, with all of the arguments specified by their
          names.

withinss: The sum of squared distances within the clusters.
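
     To make the `size' and `withinss' components concrete, the base R
     sketch below recomputes comparable quantities from a data matrix,
     a matrix of centers, and a cluster index vector. The helper name
     `cluster_summary' is hypothetical, and the sketch assumes squared
     Euclidean distances summed separately for each cluster, which may
     differ from how the package reports them.

```r
# Hypothetical helper, not part of cclust.
cluster_summary <- function(x, centers, cluster) {
  k <- nrow(centers)
  # number of data points in each cluster
  size <- tabulate(cluster, nbins = k)
  # squared Euclidean distance of each point to its own center
  d2 <- rowSums((x - centers[cluster, , drop = FALSE])^2)
  # assumption: one sum of squared distances per cluster
  withinss <- vapply(seq_len(k),
                     function(j) sum(d2[cluster == j]),
                     numeric(1))
  list(size = size, withinss = withinss)
}

x <- rbind(c(0, 0), c(0, 2), c(10, 0), c(10, 2))
centers <- rbind(c(0, 1), c(10, 1))
s <- cluster_summary(x, centers, c(1, 1, 2, 2))
```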

_A_u_t_h_o_r(_s):

     Evgenia Dimitriadou

_S_e_e _A_l_s_o:

     `plot.cclust', `predict.cclust'

_E_x_a_m_p_l_e_s:

     # a 2-dimensional example
     x<-rbind(matrix(rnorm(100,sd=0.3),ncol=2),
              matrix(rnorm(100,mean=1,sd=0.3),ncol=2))
     cl<-cclust(x,2,20,verbose=TRUE,method="kmeans")
     plot(cl,x)   

     # a 3-dimensional example 
     x<-rbind(matrix(rnorm(150,sd=0.3),ncol=3),
              matrix(rnorm(150,mean=1,sd=0.3),ncol=3),
              matrix(rnorm(150,mean=2,sd=0.3),ncol=3))
     cl<-cclust(x,6,20,verbose=TRUE,method="kmeans")
     plot(cl,x)

     # assign classes to some new data
     y<-rbind(matrix(rnorm(33,sd=0.3),ncol=3),
              matrix(rnorm(33,mean=1,sd=0.3),ncol=3),
              matrix(rnorm(3,mean=2,sd=0.3),ncol=3))
     ycl<-predict(cl, y)
     plot(ycl,y)

