
scala - Difference between org.apache.spark.ml.classification and org.apache.spark.mllib.classification

I'm writing a Spark application and would like to use algorithms from MLlib. In the API docs I found two different classes for the same algorithm.

For example, there is a LogisticRegression in org.apache.spark.ml.classification and a LogisticRegressionWithSGD in org.apache.spark.mllib.classification.

The only difference I can find is that the one in org.apache.spark.ml inherits from Estimator and can be used in cross-validation. I'm confused about why they are placed in different packages. Does anyone know the reason for it? Thanks!
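To make the contrast concrete, here is a minimal sketch of how the two classes are used. It assumes an existing SparkContext `sc` and a DataFrame `trainingDf` with "label" and "features" columns; those names are illustrative, not part of the question.

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors

import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// RDD-based API (org.apache.spark.mllib): a static train() over RDD[LabeledPoint]
val trainingRdd = sc.parallelize(Seq(
  LabeledPoint(1.0, Vectors.dense(0.0, 1.1, 0.1)),
  LabeledPoint(0.0, Vectors.dense(2.0, 1.0, -1.0))
))
val mllibModel = LogisticRegressionWithSGD.train(trainingRdd, 100)

// DataFrame-based API (org.apache.spark.ml): LogisticRegression is an Estimator,
// so it plugs directly into CrossValidator (and Pipelines)
val lr = new LogisticRegression().setMaxIter(100)
val paramGrid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.01, 0.1))
  .build()
val cv = new CrossValidator()
  .setEstimator(lr)
  .setEvaluator(new BinaryClassificationEvaluator())
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(3)
val cvModel = cv.fit(trainingDf) // trainingDf: DataFrame with "label"/"features"
```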



1 Answer


There is a JIRA ticket tracking this change.

And from the design doc:

MLlib now covers a basic selection of machine learning algorithms, e.g., logistic regression, decision trees, alternating least squares, and k-means. The current set of APIs contains several design flaws that prevent us from moving forward to address practical machine learning pipelines and make MLlib itself a scalable project.

The new set of APIs will live under org.apache.spark.ml, and o.a.s.mllib will be deprecated once we migrate all features to o.a.s.ml.
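As a rough illustration of what that new org.apache.spark.ml API looks like in practice, here is a sketch of a Pipeline that chains feature transformers with an Estimator and is fit in one step. The column names and the `trainingDf` DataFrame (with "label" and "text" columns) are assumptions for the example.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}

// The o.a.s.ml API is built around Pipelines: transformers and an Estimator
// are chained into a single Estimator that can be fit (and tuned) as a whole.
val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
val lr = new LogisticRegression().setMaxIter(10).setRegParam(0.01)

val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))
val model = pipeline.fit(trainingDf) // trainingDf: DataFrame with "label"/"text"
```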

