| (矩阵的)逆 | inverse |
| (聚类)中心 | centroid |
| (梯度)收缩 | shrinkage |
| F1分数 | F1 score |
| k近邻 | k-nearest neighbor,KNN |
| k均值 | k-means |
| t分布随机近邻嵌入 | t-distributed stochastic neighbor embedding,t-SNE |
| 阿达马积 | Hadamard product |
| 凹函数 | concave function |
| 半正定 | positive semidefinite |
| 贝叶斯网络 | Bayesian network |
| 标量 | scalar |
| 参数化模型 | parametric model |
| 测试集 | test set |
| 超参数 | hyperparameter |
| 池化 | pooling |
| 单位矩阵 | identity matrix |
| 点击率 | click-through rate,CTR |
| 点积 | dot product |
| 丢弃层 | dropout |
| 独热编码 | one-hot encoding |
| 堆垛 | stacking |
| 对角矩阵 | diagonal matrix |
| 对数似然 | log-likelihood |
| 多层感知机 | multi-layer perceptron,MLP |
| 多数投票制 | majority voting |
| 多域独热编码 | multi-field one-hot encoding |
| 反向传播 | backpropagation |
| 泛化能力 | generalization ability |
| 范数 | norm |
| 非参数化模型 | nonparametric model |
| 分类与回归树 | classification and regression tree,CART |
| 弗罗贝尼乌斯范数,F范数 | Frobenius norm |
| 概率矩阵分解 | probabilistic matrix factorization,PMF |
| 感受野 | receptive field |
| 感知机 | perceptron |
| 广义线性模型 | generalized linear model,GLM |
| 归纳偏置 | inductive bias |
| 过拟合 | overfitting |
| 函数间隔 | functional margin |
| 核函数 | kernel function |
| 核技巧 | kernel trick |
| 核矩阵 | kernel matrix |
| 黑塞矩阵 | Hessian matrix |
| 恒等函数 | identity function |
| 宏观F1分数 | macro-F1 score |
| 后剪枝 | post-pruning |
| 后验 | posterior |
| 互相关 | cross-correlation |
| 混淆矩阵 | confusion matrix |
| 机器学习 | machine learning |
| 基础学习器 | base learner |
| 基尼不纯度 | Gini impurity |
| 极大团 | maximal clique |
| 极限梯度提升 | extreme gradient boosting,XGBoost |
| 集成学习 | ensemble learning |
| 几何间隔 | geometric margin |
| 加性模型 | additive model |
| 假阳性 | false positive,FP |
| 假阴性 | false negative,FN |
| 间隔 | margin |
| 监督学习 | supervised learning |
| 降采样 | downsampling |
| 降维 | dimensionality reduction |
| 交叉熵 | cross entropy |
| 交叉验证 | cross validation |
| 精度 | precision |
| 径向基函数 | radial basis function,RBF |
| 矩阵 | matrix |
| 卷积 | convolution |
| 卷积神经网络 | convolutional neural network,CNN |
| 决策树 | decision tree |
| 均方根误差 | root mean squared error,RMSE |
| 卡鲁什-库恩-塔克条件,KKT条件 | Karush-Kuhn-Tucker conditions |
| 库尔贝克-莱布勒散度,KL散度 | Kullback-Leibler divergence |
| 拉格朗日函数 | Lagrangian function |
| 离散适应提升 | discrete AdaBoost |
| 岭回归 | ridge regression |
| 逻辑斯谛函数 | logistic function |
| 逻辑斯谛回归 | logistic regression |
| 逻辑提升 | logit boosting |
| 马尔可夫随机场 | Markov random field |
| 马尔可夫网络 | Markov network |
| 曼哈顿距离 | Manhattan distance |
| 模型族 | model family |
| 内积 | inner product |
| 欧氏距离 | Euclidean distance |
| 配分函数 | partition function |
| 批量 | batch |
| 偏差-方差分解 | bias-variance decomposition |
| 平均平方损失 | mean squared error,MSE |
| 评分预测 | rating prediction |
| 朴素贝叶斯 | naive Bayes |
| 期望-最大化算法,EM算法 | expectation-maximization algorithm |
| 奇异值分解 | singular value decomposition,SVD |
| 恰好拟合 | well fitting |
| 迁移学习 | transfer learning |
| 前剪枝 | pre-pruning |
| 前馈 | feedforward |
| 前向分步 | forward stagewise |
| 欠拟合 | underfitting |
| 强化学习 | reinforcement learning |
| 曲线下面积 | area under the curve,AUC |
| 人工神经网络 | artificial neural network,ANN |
| 熵 | entropy |
| 神经网络 | neural network,NN |
| 实适应提升 | real AdaBoost |
| 似然函数 | likelihood function |
| 势函数 | potential function |
| 适应提升 | adaptive boosting,AdaBoost |
| 受试者操作特征 | receiver operating characteristic,ROC |
| 输出层 | output layer |
| 输入层 | input layer |
| 双线性模型 | bilinear model |
| 随机森林 | random forest |
| 随机梯度下降法 | stochastic gradient descent,SGD |
| 特征分解 | eigendecomposition |
| 特征值 | eigenvalue |
| 梯度 | gradient |
| 梯度提升 | gradient boosting |
| 梯度提升决策树 | gradient boosting decision tree,GBDT |
| 梯度下降 | gradient descent |
| 提升 | boosting |
| 填充 | padding |
| 条件独立 | conditional independence |
| 凸函数 | convex function |
| 团 | clique |
| 微观F1分数 | micro-F1 score |
| 无监督学习 | unsupervised learning |
| 线性回归 | linear regression |
| 线性判别分析 | linear discriminant analysis,LDA |
| 线性整流单元 | rectified linear unit,ReLU |
| 相对熵 | relative entropy |
| 向量 | vector |
| 小批量梯度下降法 | mini-batch gradient descent,MBGD |
| 信念网络 | belief network |
| 信息增益 | information gain |
| 信息增益率 | information gain ratio |
| 序列最小优化 | sequential minimal optimization,SMO |
| 学习率 | learning rate |
| 训练集 | training set |
| 验证集 | validation set |
| 一致流形逼近与投影 | uniform manifold approximation and projection,UMAP |
| 因子分解机 | factorization machine,FM |
| 隐含层 | hidden layer |
| 预训练模型 | pre-trained model |
| 元学习器 | meta learner |
| 召回率 | recall |
| 真阳性 | true positive,TP |
| 真阳性率 | true positive rate |
| 真阴性 | true negative,TN |
| 正定 | positive definite |
| 正交矩阵 | orthogonal matrix |
| 正则化 | regularization |
| 支持向量 | support vector |
| 支持向量机 | support vector machine,SVM |
| 秩 | rank |
| 主成分分析 | principal component analysis,PCA |
| 转置 | transpose |
| 桩 | stump |
| 准确率 | accuracy |
| 自举采样 | bootstrap sampling |
| 自举聚合 | bootstrap aggregating,bagging |
| 最大后验 | maximum a posteriori,MAP |
| 最大似然估计 | maximum likelihood estimation,MLE |
| 最小绝对值收缩算子 | least absolute shrinkage and selection operator,LASSO |