<?xml version="1.0" encoding="utf-8" standalone="yes"?>
http://www.tkk7.com/changedi/category/47324.html
先知cd: loving life is the beginning of all art
Tue, 12 Nov 2013 06:17:38 GMT

- Cardinality Estimation
http://www.tkk7.com/changedi/archive/2013/11/12/406235.html
changedi, Tue, 12 Nov 2013 02:10:00 GMT

The background: under the impact of big data, many metrics (especially those that involve deduplication) can no longer be computed within reasonable space and time. UV counting is one example. The mathematical model is equivalent to continuously writing numbers into a set, ignoring duplicates, and finally reporting the number of distinct elements in the set (its cardinality). The brute-force approach keeps growing the set as numbers arrive so that it holds all of them, and simply counts them at the end. That space complexity is clearly out of reach for a single machine, so most solutions rely on distribution: the UV data is partitioned across compute nodes, each node maintains a set of this kind on its own (the Bloom filter in wdm real-time, for example), the work is divided and conquered, and the partial results are merged into one result at the end.
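The brute-force and divide-and-conquer approaches described above can be sketched as follows. This is only an illustration, not the code from the experiment: the class name `PartitionedDistinctCount`, the routing rule (uid modulo the shard count) and the in-memory "shards" standing in for compute nodes are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of the author's repository.
class PartitionedDistinctCount {

    /** Brute force: keep every uid in one set and report its size. */
    static long bruteForce(long[] uids) {
        Set<Long> seen = new HashSet<>();
        for (long uid : uids) {
            seen.add(uid);
        }
        return seen.size();
    }

    /**
     * Divide and conquer: route each uid to a shard by uid mod shardCount,
     * deduplicate per shard, then merge by summing the shard sizes.
     * The sum is exact because a given uid always lands on the same shard.
     */
    static long partitioned(long[] uids, int shardCount) {
        List<Set<Long>> shards = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            shards.add(new HashSet<>());
        }
        for (long uid : uids) {
            int shard = (int) Math.floorMod(uid, (long) shardCount);
            shards.get(shard).add(uid);
        }
        long total = 0;
        for (Set<Long> shard : shards) {
            total += shard.size();
        }
        return total;
    }

    public static void main(String[] args) {
        long[] uids = {1, 2, 2, 3, 3, 3, 42};
        System.out.println("brute force: " + bruteForce(uids));
        System.out.println("partitioned: " + partitioned(uids, 3));
    }
}
```

Both methods still store every distinct uid, so the memory cost is linear in the cardinality; partitioning only spreads that cost over several nodes.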
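Cardinality estimation was conceived to solve exactly this problem: computing the cardinality of a very large set at a low space cost when the data is big. In other words, with cardinality estimation a single machine can compute UV on the order of hundreds of millions with an error within 4%. The approach is mainly probabilistic estimation; for the principles and the method, see the ref blog and the original papers.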
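For the sake of experiment I wrote simple implementations of four algorithms: brute force (bf), Bloom filter (bbf), LogLog (llc) and HyperLogLog (hllc), to check whether cardinality estimation is a feasible way to compute deduplicated metrics (llc was very unreliable, possibly because I did not tune the number of buckets well, so I am not posting its results).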
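To make the probabilistic idea concrete, here is a minimal HyperLogLog sketch. It is not the hllc implementation from the experiment; it follows the standard published algorithm under a few assumptions of mine: uids are `long` values, the hash is a splitmix64-style mixer, the class name `HyperLogLogSketch` and its parameter `indexBits` are made up for illustration, and only the small-range (linear counting) correction is applied.

```java
// Hypothetical illustration class; not the author's hllc code.
class HyperLogLogSketch {
    private final int indexBits;      // number of hash bits used to pick a register
    private final int registerCount;  // m = 2^indexBits
    private final byte[] registers;   // max rank seen per register

    HyperLogLogSketch(int indexBits) {
        if (indexBits < 7 || indexBits > 16) {
            throw new IllegalArgumentException("indexBits must be in [7, 16]");
        }
        this.indexBits = indexBits;
        this.registerCount = 1 << indexBits;
        this.registers = new byte[registerCount];
    }

    /** splitmix64-style mixer, used here as an assumed 64-bit hash. */
    static long mix64(long z) {
        z += 0x9e3779b97f4a7c15L;
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    void add(long uid) {
        long hash = mix64(uid);
        int bucket = (int) (hash >>> (64 - indexBits));   // top bits choose the register
        long rest = hash << indexBits;                     // remaining bits, left-aligned
        int maxRank = 64 - indexBits + 1;
        // rank = position of the first 1 bit in the remaining bits
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, maxRank);
        if (rank > registers[bucket]) {
            registers[bucket] = (byte) rank;
        }
    }

    double estimate() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double m = registerCount;
        double alpha = 0.7213 / (1.0 + 1.079 / m);   // bias constant for m >= 128
        double raw = alpha * m * m / sum;            // harmonic-mean estimate
        if (raw <= 2.5 * m && zeros > 0) {
            return m * Math.log(m / zeros);          // linear counting for small sets
        }
        return raw;
    }

    public static void main(String[] args) {
        HyperLogLogSketch sketch = new HyperLogLogSketch(14);
        for (long uid = 1; uid <= 1_000_000; uid++) {
            sketch.add(uid);
        }
        System.out.println("estimate: " + sketch.estimate());
    }
}
```

With `indexBits = 14` the sketch keeps 16384 one-byte registers regardless of how many uids are added, which is where the constant space cost comes from.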
Preprocessing: generate random uids in 1..N, simulating N draws (uniform distribution); the JVM is started with -Xmx1024m.
Experimental results:
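A note on how the expected value is computed: the mathematical model of this experiment is a sequence of k random numbers drawn uniformly from 1..N, and the question is the expected number of distinct elements. In my experiment k = n, which is an extreme case (the design is purely for ease of computation; a large k makes the computation extremely slow, and at 50 million UV it could not finish at all. Increasing k should in theory improve precision; in one data set I tried, with 1 million UV and 5 million PV, hllc gave 991234, an error below 1%). In theory k corresponds to PV, and in the recurrence the expectation tends to n as k tends to infinity.
The recurrence can be derived by combinatorial analysis. I will not go through the derivation (I may well have derived it wrong; my math really is not up to it). The closed form is in the MATLAB code: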
syms e n;
e = n-(1/n)*((1-2*n+n*n)*((n-1)/n)^(n-2)+(1-n)*n+n*(n-1));
vpa(subs(e,'n',1000000),10)
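As a cross-check, the MATLAB expression above simplifies: the terms (1-n)*n and n*(n-1) cancel, and 1-2*n+n*n is (n-1)^2, so the whole expression equals n*(1-(1-1/n)^n). That is the k = n case of the standard result n*(1-(1-1/n)^k) for k uniform draws from n values. The Java sketch below compares the two numerically; the class and method names are mine, not the author's.

```java
// Hypothetical illustration class comparing the post's formula with the standard one.
class ExpectedDistinct {

    /** Standard result: expected distinct values after k uniform draws from n values. */
    static double expected(double n, double k) {
        // (1 - 1/n)^k computed as exp(k * log1p(-1/n)) for numerical stability
        return -n * Math.expm1(k * Math.log1p(-1.0 / n));
    }

    /** The closed form from the MATLAB code (k = n), transcribed term by term. */
    static double postFormula(double n) {
        return n - (1.0 / n) * ((1 - 2 * n + n * n) * Math.pow((n - 1) / n, n - 2)
                + (1 - n) * n + n * (n - 1));
    }

    public static void main(String[] args) {
        double n = 1_000_000;
        System.out.println("post formula, k = n : " + postFormula(n));
        System.out.println("standard,     k = n : " + expected(n, n));
        System.out.println("standard,     k = 5n: " + expected(n, 5 * n));
    }
}
```

For k = n the expectation is close to n*(1 - 1/e), so only about 63% of the n possible uids are expected to appear; as k grows the expectation approaches n, which matches the remark about k tending to infinity.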
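Also, I personally think the distributed Bloom filter scheme is a very good one, because it balances space and time fairly well and its precision is high. The cardinality-estimation methods essentially have O(1) space complexity, and with reasonably efficient code they can be very fast as well, but their drawback is slightly lower precision, and they are not easy to distribute (they are naturally suited to a single process; balancing the llc buckets is also better done in a single process, and distributing it is complete overkill).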
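For comparison with the estimators, a Bloom-filter-based distinct counter can be sketched as below. This is an assumed design, not the bbf code from the experiment: a uid is counted as new when at least one of its bit positions was still unset, so false positives can only make the count slightly too low. The class name, the double-hashing scheme and the mixer are my own choices.

```java
import java.util.BitSet;

// Hypothetical illustration class; not the author's bbf code.
class BloomDistinctCounter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;
    private long count;

    BloomDistinctCounter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.numBits = numBits;
        this.numHashes = numHashes;
        this.bits = new BitSet(numBits);
    }

    /** splitmix64-style mixer, used here as an assumed 64-bit hash. */
    private static long mix64(long z) {
        z += 0x9e3779b97f4a7c15L;
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    /** Adds a uid; returns true if it was definitely not seen before. */
    boolean add(long uid) {
        long h1 = mix64(uid);
        long h2 = mix64(h1) | 1L;   // odd step for double hashing
        boolean isNew = false;
        for (int i = 0; i < numHashes; i++) {
            int index = (int) Long.remainderUnsigned(h1 + i * h2, numBits);
            if (!bits.get(index)) {
                isNew = true;
                bits.set(index);
            }
        }
        if (isNew) {
            count++;
        }
        return isNew;
    }

    /** Number of uids counted as new so far (never above the true cardinality). */
    long count() {
        return count;
    }

    public static void main(String[] args) {
        BloomDistinctCounter counter = new BloomDistinctCounter(1 << 20, 4);
        for (int round = 0; round < 2; round++) {
            for (long uid = 1; uid <= 10_000; uid++) {
                counter.add(uid);
            }
        }
        System.out.println("distinct (approx): " + counter.count());
    }
}
```

Unlike the HyperLogLog registers, the bit array has to grow with the expected cardinality to keep the false-positive rate low, which is the space/precision trade-off mentioned above.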
ref blog: http://blog.codinglabs.org/articles/cardinality-estimate-exper.html#ref4
The Java code for the algorithm implementations is on GitHub: https://github.com/changedi/card-estimate

]]>
- Commons Math Study Notes: Clustering and Regression
http://www.tkk7.com/changedi/archive/2011/01/01/342124.html
changedi, Sat, 01 Jan 2011 10:35:00 GMT
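Regression is a very important concept in statistics. The Commons Math library has a regression subpackage dedicated to some basic types of linear regression. The basic interface in the regression package is MultipleLinearRegression, which expresses the basic linear regression equation y=X*b+u. Linear regression is a kind of regression analysis that models the relationship between one or more independent variables and a dependent variable using the least-squares function known as the linear regression equation. In this formula, y is an n-dimensional column vector (the regressand), X is an [n,k] matrix of observations (the regressors), b is the k-dimensional vector of regression parameters, and u is the n-dimensional residual error. What is regression analysis for? Concretely, prediction. In data mining, qualitative analysis is called classification and quantitative analysis is called regression. Regression predicts a future quantitative indicator from the observations already available. I remember that a while ago Aliyun came to our school for a technical exchange and mentioned that the prediction Alibaba and Taobao made for China's commodity trade through data analysis (or some specific kind of trade; embarrassingly, I have forgotten) was a simple linear regression done by an engineer. The model was simple, but when it was later compared with the actual data, the curves of predicted and actual values basically coincided.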
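To illustrate what fitting y=X*b+u means in the simplest case, here is a hand-rolled least-squares fit with one regressor plus an intercept. It deliberately avoids the Commons Math classes so that it runs with the JDK alone; the class name `SimpleLeastSquares` and its methods are assumptions for illustration, not the library's API.

```java
// Hypothetical illustration class; plain Java, not the Commons Math regression API.
class SimpleLeastSquares {

    /** Returns {intercept, slope} minimizing the sum of squared residuals. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two (x, y) pairs");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0;   // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0;   // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }

    /** Predicts y for a new x from the fitted parameters. */
    static double predict(double[] beta, double x) {
        return beta[0] + beta[1] * x;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {3.1, 4.9, 7.2, 8.8};
        double[] beta = fit(x, y);
        System.out.println("intercept = " + beta[0] + ", slope = " + beta[1]);
        System.out.println("prediction at x = 5: " + predict(beta, 5));
    }
}
```

Here b corresponds to the pair (intercept, slope) and u to the differences between the observed y values and the fitted line.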
]]>
- Commons Math Study Notes: Random Generation and Basic Statistics
http://www.tkk7.com/changedi/archive/2011/01/01/342123.html
changedi, Sat, 01 Jan 2011 10:30:00 GMT

]]>
- Commons Math Study Notes: Fractions and Complex Numbers
http://www.tkk7.com/changedi/archive/2010/12/27/341639.html
changedi, Mon, 27 Dec 2010 14:00:00 GMT
]]>
- Commons Math Study Notes: Distributions
http://www.tkk7.com/changedi/archive/2010/12/23/341408.html
changedi, Thu, 23 Dec 2010 12:03:00 GMT
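Commons Math also has a subpackage dedicated to wrapping probability distributions. The distribution package defines a basic interface, Distribution. The interface has only two methods: double cumulativeProbability(double x) and double cumulativeProbability(double x0, double x1). For a random variable X that follows some distribution, the former returns P(X<=x) and the latter returns P(x0<=X<=x1). As the names suggest, this is how the probability is obtained.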
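The two methods can be illustrated with a small stand-alone example. The interface below only mirrors the two signatures described above; it is not the library's Distribution type, and `ExponentialCdf` is a made-up class that assumes a continuous exponential distribution with a given mean, so that P(x0<=X<=x1) is simply F(x1)-F(x0).

```java
// Hypothetical stand-in for the two-method interface described in the text.
interface CumulativeDistribution {
    /** Returns P(X <= x). */
    double cumulativeProbability(double x);

    /** Returns P(x0 <= X <= x1). */
    double cumulativeProbability(double x0, double x1);
}

// Hypothetical illustration class: exponential distribution with the given mean.
class ExponentialCdf implements CumulativeDistribution {
    private final double mean;

    ExponentialCdf(double mean) {
        if (mean <= 0.0) {
            throw new IllegalArgumentException("mean must be positive");
        }
        this.mean = mean;
    }

    @Override
    public double cumulativeProbability(double x) {
        // F(x) = 1 - exp(-x / mean) for x > 0, and 0 otherwise
        return x <= 0.0 ? 0.0 : -Math.expm1(-x / mean);
    }

    @Override
    public double cumulativeProbability(double x0, double x1) {
        if (x0 > x1) {
            throw new IllegalArgumentException("x0 must not exceed x1");
        }
        // For a continuous distribution the interval probability is F(x1) - F(x0)
        return cumulativeProbability(x1) - cumulativeProbability(x0);
    }

    public static void main(String[] args) {
        CumulativeDistribution d = new ExponentialCdf(2.0);
        System.out.println("P(X <= 1)      = " + d.cumulativeProbability(1.0));
        System.out.println("P(1 <= X <= 3) = " + d.cumulativeProbability(1.0, 3.0));
    }
}
```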
]]>
- Commons Math Study Notes: Solving Function Equations
http://www.tkk7.com/changedi/archive/2010/12/21/341256.html
changedi, Tue, 21 Dec 2010 09:18:00 GMT
]]>
- Commons Math Study Notes: Function Integration
http://www.tkk7.com/changedi/archive/2010/12/19/341116.html
changedi, Sun, 19 Dec 2010 13:27:00 GMT

]]>
- Commons Math Study Notes: Function Interpolation
http://www.tkk7.com/changedi/archive/2010/12/16/340932.html
changedi, Thu, 16 Dec 2010 14:30:00 GMT
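Interpolation, in numerical analysis, is the process or method of finding unknown data from known discrete data. Given n discrete data points (called nodes) (xk,yk), k = 1,2,...,n, finding the value of y that corresponds to an x other than the nodes is called interpolation. Let f(x) be a function defined on the interval [a,b], let x1,x2,x3...xn be n distinct points in [a,b], and let G be a given class of functions. If there is a function g(x) in G that satisfies g(xi) = f(xi), i = 1,2,...n,
then g(x) is called the interpolating function of f(x) on G with respect to the nodes x1,x2,x3...xn.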
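One classical choice of G is the polynomials of degree below n, for which the Lagrange form gives g directly. The sketch below is plain Java rather than the Commons Math interpolation API; the class name `LagrangeInterpolation` is an assumption for illustration.

```java
// Hypothetical illustration class; plain Java, not the Commons Math interpolators.
class LagrangeInterpolation {

    /**
     * Evaluates at x the unique polynomial g of degree < n
     * with g(xs[i]) = ys[i] for every node i.
     */
    static double interpolate(double[] xs, double[] ys, double x) {
        if (xs.length != ys.length || xs.length == 0) {
            throw new IllegalArgumentException("xs and ys must be non-empty and equal in length");
        }
        double result = 0.0;
        for (int i = 0; i < xs.length; i++) {
            // Basis polynomial: 1 at xs[i], 0 at every other node
            double term = ys[i];
            for (int j = 0; j < xs.length; j++) {
                if (j != i) {
                    term *= (x - xs[j]) / (xs[i] - xs[j]);
                }
            }
            result += term;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] xs = {0.0, 1.0, 2.0};
        double[] ys = {0.0, 1.0, 4.0};   // samples of f(x) = x^2
        System.out.println("g(1.5) = " + interpolate(xs, ys, 1.5));
    }
}
```

The nodes must be pairwise distinct, as in the definition; otherwise a denominator xs[i] - xs[j] becomes zero.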
]]>
- Commons Math Study Notes: Polynomial Functions
http://www.tkk7.com/changedi/archive/2010/12/15/340745.html
changedi, Wed, 15 Dec 2010 02:48:00 GMT
]]>
- Commons Math Study Notes: Functions
http://www.tkk7.com/changedi/archive/2010/12/14/340694.html
changedi, Tue, 14 Dec 2010 11:39:00 GMT
]]>
- Commons Math Study Notes: Matrix Decomposition
http://www.tkk7.com/changedi/archive/2010/12/13/340441.html
changedi, Mon, 13 Dec 2010 01:39:00 GMT
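There are three main kinds of matrix decomposition: LU decomposition, QR decomposition and singular value decomposition. The linear package of Math provides interfaces for five kinds of decomposition: CholeskyDecomposition, EigenDecomposition, LUDecomposition, QRDecomposition and SingularValueDecomposition.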
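As a reminder of what the simplest of these does, here is a Doolittle LU decomposition in plain Java. It is a sketch under simplifying assumptions, not the library's LUDecomposition: the matrix is square, no row pivoting is performed (so a zero pivot is reported as an error), and the class name `LuSketch` is made up.

```java
// Hypothetical illustration class; no pivoting, unlike a production LU routine.
class LuSketch {

    /** Returns {L, U} with A = L * U, where L is unit lower triangular. */
    static double[][][] decompose(double[][] a) {
        int n = a.length;
        double[][] lower = new double[n][n];
        double[][] upper = new double[n][n];
        for (int i = 0; i < n; i++) {
            // Row i of U
            for (int j = i; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < i; k++) {
                    sum += lower[i][k] * upper[k][j];
                }
                upper[i][j] = a[i][j] - sum;
            }
            if (upper[i][i] == 0.0) {
                throw new ArithmeticException("zero pivot; this sketch does not pivot");
            }
            // Column i of L
            lower[i][i] = 1.0;
            for (int j = i + 1; j < n; j++) {
                double sum = 0.0;
                for (int k = 0; k < i; k++) {
                    sum += lower[j][k] * upper[k][i];
                }
                lower[j][i] = (a[j][i] - sum) / upper[i][i];
            }
        }
        return new double[][][] {lower, upper};
    }

    public static void main(String[] args) {
        double[][] a = {{4, 3}, {6, 3}};
        double[][][] lu = decompose(a);
        System.out.println("L = " + java.util.Arrays.deepToString(lu[0]));
        System.out.println("U = " + java.util.Arrays.deepToString(lu[1]));
    }
}
```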
]]>
- Commons Math Study Notes: Matrices
http://www.tkk7.com/changedi/archive/2010/12/11/340372.html
changedi, Sat, 11 Dec 2010 13:12:00 GMT
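In the Math package org.apache.commons.math.linear, matrices are represented through a hierarchy.
At the top, AnyMatrix is a basic interface. Below it are 3 sub-interfaces: BigMatrix, FieldMatrix and RealMatrix. Each sub-interface is implemented by the corresponding matrix classes, and that gives the whole matrix hierarchy. BigMatrix is no longer used, though; it has been replaced by Array2DRowFieldMatrix.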
]]>
- Commons Math Study Notes: Vectors
http://www.tkk7.com/changedi/archive/2010/12/10/340286.html
changedi, Fri, 10 Dec 2010 09:46:00 GMT

[...] see the table of contents.
Today I start with the first topic: vectors (vector).
Vector [...] org.apache.commons.math.linear [...] FieldVector [...] AbstractRealVector, ArrayRealVector.
[...] you can see that ArrayRealVector [...], while 2.0 still implements RealVector directly, which shows how the code has changed. The doc states [...] AbstractRealVector [...], and ArrayRealVector [...]. Heh, an inconsistency; in fact [...] update date [...]. The vector is a foundational concept of linear algebra. The implementation of RealVector is based on an array type [...] RealVector [...]
[...] just like the original API [...]
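[...] operations. The map*** methods return a new instance, whereas the map***ToSelf methods modify the vector itself. The implementation of mapAdd():

    public RealVector mapAdd(double d) {
        // Copy the shifted values into a fresh array and wrap it in a new vector
        double[] out = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i] + d;
        }
        return new ArrayRealVector(out);
    }

And the implementation of mapAddToSelf():

    public RealVector mapAddToSelf(double d) {
        // Update the backing array in place and return this vector
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i] + d;
        }
        return this;
    }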
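The practical difference between the two styles shows up when a reference to the original vector is kept. The stand-alone sketch below mimics the pattern with a made-up `TinyVector` class (not a Commons Math type) so that it runs without the library.

```java
// Hypothetical illustration class mimicking the mapAdd / mapAddToSelf pattern.
class TinyVector {
    private final double[] data;

    TinyVector(double[] values) {
        this.data = values.clone();
    }

    /** Returns a new vector; this one is left untouched. */
    TinyVector mapAdd(double d) {
        double[] out = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i] + d;
        }
        return new TinyVector(out);
    }

    /** Modifies this vector in place and returns it. */
    TinyVector mapAddToSelf(double d) {
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i] + d;
        }
        return this;
    }

    double[] toArray() {
        return data.clone();
    }

    public static void main(String[] args) {
        TinyVector v = new TinyVector(new double[] {1, 2, 3});
        TinyVector w = v.mapAdd(10);
        System.out.println("after mapAdd, v[0] = " + v.toArray()[0] + ", w[0] = " + w.toArray()[0]);
        TinyVector s = v.mapAddToSelf(10);
        System.out.println("after mapAddToSelf, v[0] = " + v.toArray()[0] + ", same object: " + (s == v));
    }
}
```

The in-place variant avoids allocating a new array, at the cost of changing a vector that other code may still be holding.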
[...] new ArrayRealVector [...]

    package algorithm.math;

    import org.apache.commons.math.linear.ArrayRealVector;
    import org.apache.commons.math.linear.RealVector;

    /**
     * @author Jia Yu
     * @date 2010-11-18
     */
    public class VectorTest {

        /**
         * @param args
         */
        public static void main(String[] args) {
            vector();
        }

        private static void vector() {
            double[] vec1 = { 1d, 2d, 3d };
            double[] vec2 = { 4d, 5d, 6d };
            ArrayRealVector v1 = new ArrayRealVector(vec1);
            ArrayRealVector v2 = new ArrayRealVector(vec2);

            // output directly
            System.out.println("v1 is " + v1);
            // dimension : size of vector
            System.out.println("size is " + v1.getDimension());
            // vector add
            System.out.println("v1 + v2 = " + v1.add(v2));
            System.out.println("v1 + v2 = " + v1.add(vec2));
            // vector subtract
            System.out.println("v1 - v2 = " + v1.subtract(v2));
            // vector element by element multiply
            System.out.println("v1 * v2 = " + v1.ebeMultiply(v2));
            // vector element by element divide
            System.out.println("v1 / v2 = " + v1.ebeDivide(v2));
            // get index at 1
            System.out.println("v1[1] = " + v1.getEntry(1));
            // vector append
            RealVector t_vec = v1.append(v2);
            System.out.println("v1 append v2 is " + t_vec);
            // vector distance
            System.out.println("distance between v1 and v2 is "
                    + v1.getDistance(v2));
            System.out.println("L1 distance between v1 and v2 is "
                    + v1.getL1Distance(v2));
            // vector norm
            System.out.println("norm of v1 is " + v1.getNorm());
            // vector dot product
            System.out.println("dot product of v1 and v2 is " + v1.dotProduct(v2));
            // vector outer product
            System.out.println("outer product of v1 and v2 is "
                    + v1.outerProduct(v2));
            // vector orthogonal projection
            System.out.println("orthogonal projection of v1 and v2 is "
                    + v1.projection(v2));
            // vector map operations
            System.out.println("Map the Math.abs(double) function to v1 is "
                    + v1.mapAbs());
            // in-place map: v1 now holds the element-wise inverses
            v1.mapInvToSelf();
            System.out.println("Map the 1/x function to v1 itself is " + v1);
            // vector get sub vector (taken from the inverted v1)
            System.out.println("sub vector of v1 is " + v1.getSubVector(0, 2));
        }

    }
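The corresponding output:
The [...] library gives us this convenient vector representation [...] Java [...]. Of course, all study should be grounded in the documentation; consulting the docs while writing code is a must. So do not think of it as a chore: go [...] the api doc [...] coding. Related resources:
Package: http://commons.apache.org/math/index.html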

]]>
- Commons Math Study Notes: Table of Contents (updated as I go)
http://www.tkk7.com/changedi/archive/2010/12/10/340282.html
changedi, Fri, 10 Dec 2010 09:41:00 GMT

Table of contents
[...] a study of the [...] library, whose proper name should be The Apache Commons Mathematics Library. [...] I searched online and found no related study material, so there was nothing for it but to chew through the code bit by bit on my own. It is a good way to pass idle time and to review some mathematics. One main reason, in fact, is that I never took any course in numerical analysis during my undergraduate or master's studies, which is truly sad, so I came up with the idea of studying Math [...] and offering a bit of reference for people who write mathematical code.
[...] (following the structural design of the library)

Section 1 linear algebra
1) Vector
2) Matrix
3) Matrix Decomposition

Mathematical analysis (mainly functions)
1) Function
2) Polynomial
3) Interpolation
4) Integration
5) Solver

Probability and statistics
1) distribution
2) fraction and complex
3) random and statistics
4) cluster and regression

]]>