
Goals of this section:
- Use binary classification to separate normal email from spam.
- Classify emails into finer-grained categories.
5.3.1 Naive Bayes Classification
This is the same Bayesian classifier used in Chapter 2; even the test data is the same.
Main program:
package com.hankcs;
import iweb2.ch5.usecase.email.EmailClassifier;
import iweb2.ch5.usecase.email.data.Email;
import iweb2.ch5.usecase.email.data.EmailData;
import iweb2.ch5.usecase.email.data.EmailDataset;
public class ch5_1_EmailClass
{
public static void main(String[] args) throws Exception
{
// Create and train classifier
// load the document list
EmailDataset trainEmailDS = EmailData.createTrainingDataset();
// only consider the top 10 most frequent terms
EmailClassifier emailFilter = new EmailClassifier(trainEmailDS, 10);
emailFilter.train();
// Let's classify some emails from the training set. If we can't get them right,
// then we are in trouble :-)
// quick test on two known emails from the training set
Email email = null;
email = trainEmailDS.findEmailById("biz-04.html");
emailFilter.classify(email);
email = trainEmailDS.findEmailById("usa-03.html");
emailFilter.classify(email);
// Now, let's classify previously unseen emails
// load emails from the unseen test dataset
EmailDataset testEmailDS = EmailData.createTestDataset();
email = testEmailDS.findEmailById("biz-01.html");
emailFilter.classify(email);
email = testEmailDS.findEmailById("sport-01.html");
emailFilter.classify(email);
email = testEmailDS.findEmailById("usa-01.html");
emailFilter.classify(email);
email = testEmailDS.findEmailById("world-01.html");
emailFilter.classify(email);
email = testEmailDS.findEmailById("spam-biz-01.html");
emailFilter.classify(email);
}
}
Output:
P(i,c) = 0.06666666666666667, P(c) = 0.8823529411764706, P(i) = 0.06535947712418301
P(i,c) = 0.05555555555555555, P(c) = 0.11764705882352941, P(i) = 0.06535947712418301
Classified biz-04.html as NOT SPAM
P(i,c) = 0.06666666666666667, P(c) = 0.8823529411764706, P(i) = 0.06535947712418301
P(i,c) = 0.05555555555555555, P(c) = 0.11764705882352941, P(i) = 0.06535947712418301
Classified usa-03.html as NOT SPAM
P(i,c) = 0.13333333333333333, P(c) = 0.8823529411764706, P(i) = 0.12418300653594772
P(i,c) = 0.05555555555555555, P(c) = 0.11764705882352941, P(i) = 0.12418300653594772
Classified biz-01.html as NOT SPAM
P(i,c) = 0.06666666666666667, P(c) = 0.8823529411764706, P(i) = 0.06535947712418301
P(i,c) = 0.05555555555555555, P(c) = 0.11764705882352941, P(i) = 0.06535947712418301
Classified sport-01.html as NOT SPAM
P(i,c) = 0.05555555555555555, P(c) = 0.8823529411764706, P(i) = 0.055555555555555546
P(i,c) = 0.05555555555555555, P(c) = 0.11764705882352941, P(i) = 0.055555555555555546
Classified usa-01.html as NOT SPAM
P(i,c) = 0.2, P(c) = 0.8823529411764706, P(i) = 0.18300653594771243
P(i,c) = 0.05555555555555555, P(c) = 0.11764705882352941, P(i) = 0.18300653594771243
Classified world-01.html as NOT SPAM
P(i,c) = 0.05555555555555555, P(c) = 0.8823529411764706, P(i) = 0.10784313725490195
P(i,c) = 0.5, P(c) = 0.11764705882352941, P(i) = 0.10784313725490195
Classified spam-biz-01.html as SPAM
EmailClassifier obtains its training set through the EmailDataset instance's getTrainingSet method. After loading these email instances, the classifier trains itself, learning how to map concepts to instances based on the assignments in the training set. EmailClassifier does not use the entire email during training; it uses a single attribute, whose value is generated when the EmailInstance is constructed:
/**
* Creates an instance.
* @param emailCategory
* @param email
* @param topNTerms
*/
public EmailInstance(String emailCategory, Email email, int topNTerms)
{
super();
this.id = email.getId();
// email category is our concept/class
this.setConcept(new BaseConcept(emailCategory));
/**
* TODO: 5.3 -- Considering more attributes as part of the EmailInstance
*
* -- Separate "subject" and "body"
* -- timestamp
* -- "from"
* -- "to"
* -- "to" cardinality
*/
// extract top N terms from email content and subject
// concatenate the email's subject and body
String text = email.getSubject() + " " + email.getTextBody();
// find the top N most frequent terms
Content content = new Content(email.getId(), text, topNTerms);
Map<String, Integer> tfMap = content.getTFMap();
attributes = new StringAttribute[1];
String attrName = "Email_Text_Attribute";
String attrValue = "";
for (Map.Entry<String, Integer> tfEntry : tfMap.entrySet())
{
attrValue = attrValue + " " + tfEntry.getKey();
}
// create the single attribute: the top N most frequent terms
attributes[0] = new StringAttribute(attrName, attrValue);
}
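The Content class referenced above belongs to the book's codebase and is not shown here; conceptually it tokenizes the combined subject and body and keeps the N most frequent terms. A minimal sketch of that idea (the class and method names below are illustrative, not the book's API):

```java
import java.util.*;
import java.util.stream.Collectors;

public class TopTerms {
    // Return the N most frequent terms of the text, most frequent first.
    public static List<String> topN(String text, int n) {
        Map<String, Integer> tf = new HashMap<>();
        // crude tokenization: lowercase, split on non-word characters
        for (String term : text.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        // sort by descending frequency and keep the top N terms
        return tf.entrySet().stream()
                 .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                 .limit(n)
                 .map(Map.Entry::getKey)
                 .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String text = "cheap drugs cheap viagra cheap deal";
        System.out.println(topN(text, 2)); // "cheap" is the most frequent term
    }
}
```

The terms returned here would then be joined into the single Email_Text_Attribute string, as in the constructor above.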
Details of the Naive Bayes classifier
The classifier learns associations between instances and concepts from the training instances, and then assigns an appropriate concept to a given instance. The Bayes algorithm already appeared in Algorithms of the Intelligent Web, Section 2.4 (improving search results based on user clicks), so the annotated code is given directly here; see Section 2.4 for a detailed explanation.
package iweb2.ch5.classification.bayes;
import iweb2.ch5.classification.core.TrainingSet;
import iweb2.ch5.classification.core.intf.Classifier;
import iweb2.ch5.ontology.core.AttributeValue;
import iweb2.ch5.ontology.intf.Attribute;
import iweb2.ch5.ontology.intf.Concept;
import iweb2.ch5.ontology.intf.Instance;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
/**
* A basic implementation of the Naive Bayes algorithm.
* <p/>
* The emphasis is on teaching the algorithm, not optimizing its performance.
*
* @author babis
*/
public class NaiveBayes implements Classifier
{
/**
* The name identifying this NaiveBayes classifier instance.<br></br>
* You can use the NaiveBayes classifier in many occasions
* So, let's give it a name to identify the instance of the Classifier.
*/
private String name;
/**
* The training set: fixed once set, though new instances can still be added to it.<br></br>
* Every classifier needs a training set. Notice that both the name
* of the classifier and its training set are intentionally set during
* the Construction phase.
* <p/>
* Once you created an instance of the NaiveBayes classifier you cannot
* set its TrainingSet but you can always get the reference to it and
* add instances.
*/
protected TrainingSet tSet;
/**
* Instance counts for each concept in the training set (used to derive the priors).<br></br>
* These are the probabilities for each concept
*/
protected Map<Concept, Double> conceptPriors;
/**
* The conditional probabilities: in Section 2.4, the probability that user A wants link X for query Q;<br></br>
* here, the probability of observing concept X given instance Y.
* This structure contains the fundamental calculation elements of
* the Naive Bayes method, i.e. the conditional probabilities.
*/
protected Map<Concept, Map<Attribute, AttributeValue>> p;
/**
* The attributes used during training.<br></br>
* These are the attribute indices that we should consider for training
*/
protected ArrayList<String> attributeList;
/**
* An auxiliary variable
*/
protected boolean verbose = false;
/**
* The only constructor for this classifier takes a name and
* a training set as arguments.
*
* @param name the name of the classifier
* @param set the training set for this classifier
*/
public NaiveBayes(String name, TrainingSet set)
{
this.name = name;
tSet = set;
conceptPriors = new HashMap<Concept, Double>(tSet.getNumberOfConcepts());
verbose = false;
}
/**
* Returns the concept with the highest probability as the classification result for the given instance.
* @param instance
* @return
*/
public Concept classify(Instance instance)
{
Concept bestConcept = null;
double bestP = 0.0;
if (tSet == null || tSet.getConceptSet().size() == 0)
{
throw new RuntimeException("You have to train classifier first.");
}
if (verbose)
{
System.out.println("\n*** Classifying instance: " + instance.toString() + "\n");
}
for (Concept c : tSet.getConceptSet())
{
double p = getProbability(c, instance);
if (verbose)
{
System.out.printf("P(%s|%s) = %.15f\n", c.getName(), instance.toString(), p);
}
if (p >= bestP)
{
bestConcept = c;
bestP = p;
}
}
return bestConcept;
}
/**
* Training simply sets the probability for each concept
* Trains the classifier.
*/
public boolean train()
{
long t0 = System.currentTimeMillis();
boolean hasTrained = false;
// make sure at least one attribute is available
if (attributeList == null || attributeList.size() == 0)
{
System.out.print("Can't train the classifier without specifying the attributes for training!");
System.out.print("Use the method --> trainOnAttribute(Attribute a)");
}
else
{
calculateConceptPriors();
calculateConditionalProbabilities();
hasTrained = true;
}
if (verbose)
{
System.out.print(" Naive Bayes training completed in ");
System.out.println((System.currentTimeMillis() - t0) + " (ms)");
}
return hasTrained;
}
public void trainOnAttribute(String aName)
{
if (attributeList == null)
{
attributeList = new ArrayList<String>();
}
attributeList.add(aName);
}
/**
* Strictly speaking these are not the prior probabilities but just the counts.
* However, we want to reuse these counts and the priors can be obtained by a simple division.
* Computes the concept priors P(c); this implementation stores counts, and the prior is obtained by dividing each count by the total number of instances in the training set.
*/
private void calculateConceptPriors()
{
for (Concept c : tSet.getConceptSet())
{
//Calculate the priors for the concepts
// count how often each concept appears by iterating over all training instances
int totalConceptCount = 0;
for (Instance i : tSet.getInstances().values())
{
if (i.getConcept().equals(c))
{
totalConceptCount++;
}
}
conceptPriors.put(c, new Double(totalConceptCount));
}
}
/**
* Counts how often each attribute value occurs within each concept.
*/
protected void calculateConditionalProbabilities()
{
p = new HashMap<Concept, Map<Attribute, AttributeValue>>();
for (Instance i : tSet.getInstances().values())
{
for (Attribute a : i.getAtrributes())
{
if (a != null && attributeList.contains(a.getName()))
{
if (p.get(i.getConcept()) == null)
{
p.put(i.getConcept(), new HashMap<Attribute, AttributeValue>());
}
Map<Attribute, AttributeValue> aMap = p.get(i.getConcept());
AttributeValue aV = aMap.get(a);
if (aV == null)
{
aV = new AttributeValue(a.getValue());
aMap.put(a, aV);
}
else
{
aV.count();
}
}
}
}
}
/**
* Computes the posterior probability that instance i belongs to concept c (in Section 2.4 terms: the probability that the user's true intent is c given query i).<br></br>
* This method calculates the <I>posterior probability</I> that we deal with
* concept <CODE>c</CODE> provided that we observed instance <CODE>i</CODE>.
* This is the application of Bayes theorem.
*
* @param c is a probable concept for instance <CODE>i</CODE>
* @param i is the observed instance
* @return the posterior probability of <CODE>c</CODE> given instance <CODE>i</CODE>
*/
public double getProbability(Concept c, Instance i)
{
double cP = 0;
// if the concept is in the training set, apply Bayes' theorem to get the posterior
if (tSet.getConceptSet().contains(c))
{
cP = (getProbability(i, c) * getProbability(c)) / getProbability(i);
System.out.println("P(i,c) = " + getProbability(i, c) + ", P(c) = " + getProbability(c) + ", P(i) = " + getProbability(i));
}
else
{
// We have never seen this concept before,
// so assign it a "reasonable" value
cP = 1 / (tSet.getNumberOfConcepts() + 1.0);
}
return cP;
}
/**
* Computes the probability of observing instance i: the denominator of Bayes' theorem.
* @param i
* @return the probability of observing <CODE>Instance</CODE> i
*/
public double getProbability(Instance i)
{
double cP = 0;
for (Concept c : getTset().getConceptSet())
{
cP += getProbability(i, c) * getProbability(c);
}
return (cP == 0) ? (double) 1 / tSet.getSize() : cP;
}
/**
* Returns the prior probability of c, derived from the concept's count in the training set.
* @param c
* @return
*/
public double getProbability(Concept c)
{
Double trInstanceCount = conceptPriors.get(c);
if (trInstanceCount == null)
{
trInstanceCount = 0.0;
}
return trInstanceCount / tSet.getSize();
}
/**
* The probability of observing instance i given concept c (in Section 2.4 terms: the probability of query i given that the user wants c).
* @param i
* @param c
* @return
*/
public double getProbability(Instance i, Concept c)
{
double cP = 1;
// iterate over the instance's attributes
for (Attribute a : i.getAtrributes())
{
// if this attribute is used for training
if (a != null && attributeList.contains(a.getName()))
{
Map<Attribute, AttributeValue> aMap = p.get(c);
AttributeValue aV = aMap.get(a);
if (aV == null)
{
// the specific attribute value is not present for the current concept.
// Can you justify the following estimate?
// Can you think of a better choice?
// an attribute value never observed for this concept
cP *= ((double) 1 / (tSet.getSize() + 1));
}
else
{
// the likelihood of instance i is the product of the probabilities of its attribute values
// each such probability = the value's count / the number of instances of the concept
cP *= (aV.getCount() / conceptPriors.get(c));
}
}
}
return (cP == 1) ? (double) 1 / tSet.getNumberOfConcepts() : cP;
}
/**
* @return the name
*/
public String getName()
{
return name;
}
/**
* Returns the training set.
* @return the tSet
*/
public TrainingSet getTset()
{
return tSet;
}
}
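The arithmetic behind getProbability(c, i) is Bayes' theorem, P(c|i) = P(i|c)P(c) / P(i), with P(i) expanded over all concepts. A self-contained toy calculation may help; the counts and likelihoods below are invented for illustration, not taken from the book's dataset:

```java
public class BayesToy {
    public static void main(String[] args) {
        // Hypothetical priors: 15 NOT-SPAM emails, 2 SPAM emails, 17 total.
        double pNotSpam = 15.0 / 17.0;      // P(c = NOT-SPAM)
        double pSpam    = 2.0 / 17.0;       // P(c = SPAM)
        // Hypothetical likelihoods of this email's attribute value:
        double pGivenNotSpam = 0.05;        // P(i | NOT-SPAM)
        double pGivenSpam    = 0.5;         // P(i | SPAM)
        // Total probability of the instance: sum over all concepts.
        double pI = pGivenNotSpam * pNotSpam + pGivenSpam * pSpam;
        // Posteriors via Bayes' theorem.
        double postSpam    = pGivenSpam * pSpam / pI;
        double postNotSpam = pGivenNotSpam * pNotSpam / pI;
        System.out.printf("P(SPAM|i) = %.3f, P(NOT-SPAM|i) = %.3f%n",
                postSpam, postNotSpam); // P(SPAM|i) = 0.571, P(NOT-SPAM|i) = 0.429
        // The two posteriors sum to 1; SPAM wins despite its small prior
        // because the attribute value is far more likely under SPAM.
    }
}
```

This is exactly why spam-biz-01.html was classified as SPAM in the output above: its P(i,c) under SPAM (0.5) dwarfs the one under NOT-SPAM.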
A general-purpose email classifier
EmailClassifier extends NaiveBayes with a few refinements.
First, it makes sure the training set has been loaded:
@Override
public boolean train()
{
// make sure the training data has been loaded
if (emailDataset.getSize() == 0)
{
System.out.println("Can't train classifier - training dataset is empty.");
return false;
}
for (String attrName : getTset().getAttributeNameSet())
{
trainOnAttribute(attrName);
}
super.train();
return true;
}
Second, fuzzy matching of attribute values: since there is only one attribute and every email's value differs, the classifier looks for the stored attribute value "most similar" to a and increments that value's count. Similarity is measured as intersection over union (the Jaccard coefficient):
/**
* Finds, in the attribute map aMap, the value whose terms overlap most with those of attribute a.
* Because the attribute is a subject-plus-keywords string that differs for every email,
* this method matches two attribute values approximately.
* @param aMap
* @param a
* @return
*/
private AttributeValue findBestAttributeValue(Map<Attribute, AttributeValue> aMap, Attribute a)
{
JaccardCoefficient jaccardCoeff = new JaccardCoefficient();
String aValue = (String) a.getValue();
String[] aTerms = aValue.split(" ");
Attribute bestMatch = null;
double bestSim = 0.0;
/*
* Here we only check representative attribute values. Other attribute values
* associated with representative attribute values will be
* ignored by this implementation.
*/
for (Attribute attr : aMap.keySet())
{
String attrValue = (String) attr.getValue();
String[] attrTerms = attrValue.split(" ");
double sim = jaccardCoeff.similarity(aTerms, attrTerms);
if (sim > jaccardThreshold && sim > bestSim)
{
bestSim = sim;
bestMatch = attr;
}
}
return aMap.get(bestMatch);
}
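The JaccardCoefficient class used above computes |A ∩ B| / |A ∪ B| over the two term sets. A minimal equivalent (illustrative, not the book's implementation):

```java
import java.util.*;

public class Jaccard {
    // Jaccard similarity of two term arrays: |intersection| / |union|.
    public static double similarity(String[] a, String[] b) {
        Set<String> setA = new HashSet<>(Arrays.asList(a));
        Set<String> setB = new HashSet<>(Arrays.asList(b));
        Set<String> intersection = new HashSet<>(setA);
        intersection.retainAll(setB);
        Set<String> union = new HashSet<>(setA);
        union.addAll(setB);
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        String[] x = {"cheap", "drugs", "online"};
        String[] y = {"cheap", "drugs", "now"};
        // 2 shared terms out of 4 distinct terms
        System.out.println(similarity(x, y)); // 0.5
    }
}
```

With a threshold like the jaccardThreshold field above, near-duplicate subject/keyword strings are treated as the same attribute value.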
This fuzzy matching is also applied when computing the likelihood of instance i given concept c:
@Override
public double getProbability(Instance i, Concept c)
{
double cP = 1;
for (Attribute a : i.getAtrributes())
{
if (a != null && attributeList.contains(a.getName()))
{
Map<Attribute, AttributeValue> aMap = p.get(c);
AttributeValue bestAttributeValue = findBestAttributeValue(aMap, a);
if (bestAttributeValue == null)
{
// the specific attribute value is not present for the current concept.
// Can you justify the following estimate?
// Can you think of a better choice?
cP *= ((double) 1 / (tSet.getSize() + 1));
}
else
{
cP *= (bestAttributeValue.getCount() / conceptPriors.get(c));
}
}
}
return (cP == 1) ? (double) 1 / tSet.getNumberOfConcepts() : cP;
}
Third, the conditional-probability calculation drops one loop (there is only one attribute) and uses the fuzzy match:
@Override
protected void calculateConditionalProbabilities()
{
p = new HashMap<Concept, Map<Attribute, AttributeValue>>();
for (Instance i : tSet.getInstances().values())
{
// In this specific implementation we have exactly one attribute
// In general, you need a loop over the attributes
// with exactly one attribute there is no need to loop
Attribute a = i.getAtrributes()[0];
Map<Attribute, AttributeValue> aMap = p.get(i.getConcept());
if (aMap == null)
{
aMap = new HashMap<Attribute, AttributeValue>();
p.put(i.getConcept(), aMap);
}
/**
* TODO: 5.3
*/
AttributeValue bestAttributeValue = findBestAttributeValue(aMap, a);
if (bestAttributeValue != null)
{
bestAttributeValue.count();
}
else
{
AttributeValue aV = new AttributeValue(a.getValue());
// register attribute as representative attribute
aMap.put(a, aV);
}
}
}
5.3.2 Rule-Based Classification
A rule is an if-then programming clause. For example:
when
    Email( $s : subject )
    eval( classificationResult.isSimilar($s, "viagra" ) )
then
    classificationResult.setSpamEmail(true);
This rule says: if the subject contains "viagra", classify the email as spam. Such rules are usually entered manually or semi-manually.
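The isSimilar helper belongs to the book's ClassificationResult class and its implementation is not shown here; presumably it amounts to a case-insensitive check for the term among the subject's words. A hypothetical sketch of that check:

```java
public class SimilarCheck {
    // Hypothetical stand-in for ClassificationResult.isSimilar:
    // true if any word of the text equals the target term, ignoring case.
    public static boolean isSimilar(String text, String term) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (word.equals(term.toLowerCase())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isSimilar("Cheap VIAGRA here", "viagra")); // true
        System.out.println(isSimilar("Quarterly report", "viagra"));  // false
    }
}
```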
The Drools rule engine
Drools is a Java-based rule engine, an open-source project under JBoss. It consists of two main modules: a pattern-matching module and an agenda module. The pattern matcher identifies which rules are triggered by the facts; once identified, those rules are placed on the agenda. Drools implements and extends the Rete algorithm, which is beyond the scope of Algorithms of the Intelligent Web.
Drools provides a scripting language:
package demo;

import iweb2.ch5.usecase.email.data.Email;
import iweb2.ch5.classification.rules.ClassificationResult;

global ClassificationResult classificationResult;

rule "Tests for viagra in subject"
salience 100
when
    Email( $s : subject )
    eval( classificationResult.isSimilar($s, "viagra" ) )
then
    classificationResult.setSpamEmail(true);
end

rule "Tests for 'drugs' in subject"
salience 100
when
    Email( $s : subject )
    eval( classificationResult.isSimilar($s, "drugs" ) )
then
    classificationResult.setSpamEmail(true);
end
This script is loaded and run by the main program:
package com.hankcs;
import iweb2.ch5.classification.rules.EmailRuleClassifier;
import iweb2.ch5.usecase.email.data.EmailData;
import iweb2.ch5.usecase.email.data.EmailDataset;
public class ch5_2_EmailRules
{
public static void main(String[] args) throws Exception
{
// Load email data
EmailDataset ds = EmailData.createTestDataset();
// Create classifier based on rules
// Expecting one spam email
// create a rule-based email classifier
EmailRuleClassifier classifier = new EmailRuleClassifier("c:/iWeb2/data/ch05/spamRules.drl");
// "train" the classifier. In a rule-based system the rules are the knowledge,
// so the "training" step merely loads the rules from the file.
classifier.train();
// pass the dataset and a descriptive message to the classifier
classifier.run(ds,"Expecting one spam email. :-(");
// There should be no spam emails.
// Rule that checks for known email address should win over rules that detect spam content.
//
// classifier = new EmailRuleClassifier("c:/iWeb2/data/ch05/spamRulesWithConflict.drl");
//
// classifier.train();
//
// classifier.run(ds,"No spam emails here. Hurray!\n");
}
}
Output:
Expecting one spam email. :-(
__________________________________________________
Classifying email: world-01.html ...
Rules classified email: world-01.html as: NOT-SPAM
Classifying email: spam-biz-01.html ...
Invoked ClassificationResult.setSpamEmail(true)
Rules classified email: spam-biz-01.html as: SPAM
Classifying email: sport-01.html ...
Rules classified email: sport-01.html as: NOT-SPAM
Classifying email: usa-01.html ...
Rules classified email: usa-01.html as: NOT-SPAM
Classifying email: biz-01.html ...
Rules classified email: biz-01.html as: NOT-SPAM
__________________________________________________
The spam email spam-biz-01.html triggered the classifier because the word "drugs" in its subject matched the spam rule; once a rule's condition is satisfied, the rule fires.
Code details
The rule engine is based on Drools and bridges the .drl script with Java; the runtime compiler used here is a fine showcase of Java's flexibility. Implementation:
package iweb2.ch5.classification.rules;
import iweb2.ch5.usecase.email.data.Email;
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Properties;
import org.drools.RuleBase;
import org.drools.RuleBaseFactory;
import org.drools.WorkingMemory;
import org.drools.compiler.PackageBuilder;
import org.drools.compiler.PackageBuilderConfiguration;
import org.drools.rule.Package;
public class RuleEngine
{
private RuleBase rules;
public RuleEngine(String rulesFile) throws RuleEngineException
{
try
{
// read the rules file
Reader source = new InputStreamReader(
new BufferedInputStream(new FileInputStream(rulesFile)));
// switch to the JANINO runtime compiler
Properties properties = new Properties();
properties.setProperty("drools.dialect.java.compiler",
"JANINO");
PackageBuilderConfiguration cfg =
new PackageBuilderConfiguration(properties);
// build a rule package
PackageBuilder builder = new PackageBuilder(cfg);
// parse and compile the rules
builder.addPackageFromDrl(source);
// the package containing our custom rules
Package pkg = builder.getPackage();
// the runtime rule container
rules = RuleBaseFactory.newRuleBase();
rules.addPackage(pkg);
} catch (Exception e)
{
throw new RuleEngineException(
"Could not load/compile rules from file: '" + rulesFile
+ "' ", e);
}
}
public void executeRules(ClassificationResult classificationResult, Email email)
{
WorkingMemory workingMemory = rules.newStatefulSession();
// make Java objects available inside the drl script!
workingMemory.setGlobal("classificationResult", classificationResult);
workingMemory.insert(email);
workingMemory.fireAllRules();
}
}
Conflict resolution
Two rules can conflict: for example, one rule concludes that email E is spam while another concludes it is not; or rule A modifies a fact, which triggers rule B, whose modification re-triggers A, causing an infinite loop.
Drools resolves such conflicts through rule attributes.
A simple spam-filtering rule set (with a conflict):
package demo;
import iweb2.ch5.usecase.email.data.Email;
import iweb2.ch5.classification.rules.ClassificationResult;
global ClassificationResult classificationResult;
rule "Rule 1: Tests for viagra in subject"
salience 100
# no-loop true
# lock-on-active true
when
email: Email( $s : subject )
eval( classificationResult.isSimilar($s, "viagra" ) )
then
email.setRuleFired(1);
classificationResult.setSpamEmail(true);
# update(email);
end
rule "Rule 2: Tests for 'drugs' in subject"
# no-loop true
# lock-on-active true
salience 100
when
email: Email( $s : subject )
eval( classificationResult.isSimilar($s, "drugs" ) )
then
email.setRuleFired(2);
# email.setSubject("processed");
# email.setFrom("processed");
classificationResult.setSpamEmail(true);
# update(email);
end
rule "Rule 3: Tests for known sender address"
# no-loop true
# lock-on-active true
salience 10
when
email: Email( $sender : from )
eval( classificationResult.isSimilar($sender, "friend@senderhost" ) )
then
email.setRuleFired(3);
classificationResult.setSpamEmail(false);
# update(email);
end
salience indicates a rule's priority: rules with higher salience fire first. Here the higher-salience Rules 1 and 2 fire before Rule 3 (salience 10), so Rule 3 runs last and its setSpamEmail(false) overrides the earlier spam verdict.
Output:
Classifying email: world-01.html ...
Rules classified email: world-01.html as: NOT-SPAM
Classifying email: spam-biz-01.html ...
Invoked Email.setRuleFired(2), current value ruleFired=0, emailId: spam-biz-01.html
Invoked ClassificationResult.setSpamEmail(true)
Invoked Email.setRuleFired(3), current value ruleFired=2, emailId: spam-biz-01.html
Invoked ClassificationResult.setSpamEmail(false)
Rules classified email: spam-biz-01.html as: NOT-SPAM
Classifying email: sport-01.html ...
Rules classified email: sport-01.html as: NOT-SPAM
Classifying email: usa-01.html ...
Rules classified email: usa-01.html as: NOT-SPAM
Classifying email: biz-01.html ...
Rules classified email: biz-01.html as: NOT-SPAM
__________________________________________________
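The firing order in this output can be reproduced in miniature: rules with higher salience fire first, so the lower-salience whitelist rule runs last and its verdict stands. A hypothetical sketch of that agenda behavior (plain Java, not Drools):

```java
import java.util.*;

public class SalienceDemo {
    record Rule(String name, int salience, boolean verdict) {}

    // Fire rules in descending-salience order; the last verdict written wins.
    static boolean classify(List<Rule> agenda) {
        List<Rule> ordered = new ArrayList<>(agenda);
        ordered.sort(Comparator.comparingInt(Rule::salience).reversed());
        boolean spam = false;
        for (Rule r : ordered) {
            spam = r.verdict(); // each firing overwrites the classification
            System.out.println(r.name() + " -> spam = " + spam);
        }
        return spam;
    }

    public static void main(String[] args) {
        boolean spam = classify(List.of(
                new Rule("Rule 2: 'drugs' in subject", 100, true),
                new Rule("Rule 3: known sender", 10, false)));
        System.out.println("Final: " + (spam ? "SPAM" : "NOT-SPAM")); // NOT-SPAM
    }
}
```

In real Drools the attributes commented out in the script above (no-loop, lock-on-active, update) are the usual tools for taming such interactions.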