Oracle: extracting the text before and after a given character in SQL

Straight to the code:

-- Create the test table and data
create table test
(name varchar2(10));

insert into test values ('2-15');
insert into test values ('2-33');
insert into test values ('2-3');
insert into test values ('12-8');
insert into test values ('12-22');
insert into test values ('12-3');

-- Run the query
select name,
       substr(name, 1, instr(name, '-') - 1) before_part,
       substr(name, instr(name, '-') + 1, length(name) - instr(name, '-')) after_part
from test;

-- Result
NAME  BEFORE_PART  AFTER_PART
2-15  2            15
2-33  2            33
2-3   2            3
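
The arithmetic behind the two SUBSTR calls is easier to check outside the database. The sketch below mimics Oracle's 1-based INSTR and SUBSTR in Python. The helpers `instr`, `substr`, and `split_on_dash` are hypothetical names written for this illustration, not a database API.

```python
def instr(s: str, sub: str) -> int:
    """1-based position of sub in s, or 0 when it is absent (like Oracle INSTR)."""
    return s.find(sub) + 1


def substr(s: str, start: int, length: int) -> str:
    """1-based substring; a length below 1 yields an empty result (Oracle returns NULL)."""
    if length < 1:
        return ""
    return s[start - 1:start - 1 + length]


def split_on_dash(name: str) -> tuple[str, str]:
    pos = instr(name, "-")
    before = substr(name, 1, pos - 1)               # everything left of the dash
    after = substr(name, pos + 1, len(name) - pos)  # everything right of the dash
    return before, after


for name in ["2-15", "12-8", "12-22"]:
    print(name, split_on_dash(name))
```

When the value contains no dash, `instr` returns 0, so the "before" part is empty and the "after" part is the whole string, which matches what the SQL expressions do.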

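Oracle: a partitioned table example

Understanding partitioned tables in Oracle database development
Partitioning physically splits a very large table or index into several smaller parts that can be managed independently.
A partitioned table or index is logically a single table or index, but physically it consists of multiple partitions.
By improving manageability, performance, and availability, partitioning brings considerable benefits to many kinds of applications.
Benefits of partitioning:
1. Better data availability: if one partition is unavailable because of a failure or maintenance, the remaining partitions of the table stay available.
2. Easier maintenance: managing several partitions independently is easier than maintaining one huge table.
3. Balanced I/O: different partitions can be mapped to different disks to balance I/O, which can significantly improve performance.
4. Better query performance: some queries against partitioned objects run faster because the search is limited to the partitions of interest.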

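What kinds of partitioning are there?
Oracle 11g provides 6 table partitioning methods: range partitioning, hash partitioning, list partitioning,
composite partitioning, interval partitioning, and reference partitioning.

Range partitioning splits a table by ranges of values in one column; the value of that column decides which partition a row is stored in.

Points to note when creating a range partition:
1. Specify the partitioning method, the partitioning column, and the partition descriptions.
2. Every partition has a VALUES LESS THAN clause.
3. Define MAXVALUE in the highest partition; MAXVALUE is higher than any key value in the other partitions. The example below omits it, so a row dated on or after 2014-05-31 would be rejected with ORA-14400.
Example: creating a range-partitioned table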

create table range_orders
(order_id varchar2(10) constraint OR_PK primary key
,order_date date default sysdate
,qty integer
,payterms varchar2(10)
,book_id number(6)
)
partition by range (order_date)
(partition p1 values less than (to_date('20140331','yyyymmdd')) tablespace user01,
partition p2 values less than (to_date('20140430','yyyymmdd')) tablespace user02,
partition p3 values less than (to_date('20140531','yyyymmdd')) tablespace user03
)
;

SQL> insert into range_orders values ('10001',to_date('20140321','yyyymmdd'),1,'payterm_1',110345);
1 row inserted
SQL> insert into range_orders values ('10002',to_date('20140421','yyyymmdd'),1,'payterm_2',110745);
1 row inserted
SQL> insert into range_orders values ('10003',to_date('20140521','yyyymmdd'),1,'payterm_3',110945);
1 row inserted

SQL> commit;
Commit complete

SQL> select rowid,r.* from range_orders r;
ROWID              ORDER_ID   ORDER_DATE  QTY  PAYTERMS   BOOK_ID
------------------ ---------- ----------- ---- ---------- -------
AAADwpAAGAAAACFAAA 10001      2014/3/21   1    payterm_1  110345
AAADwqAAHAAAACFAAA 10002      2014/4/21   1    payterm_2  110745
AAADwrAAIAAAACFAAA 10003      2014/5/21   1    payterm_3  110945

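The file-number portion of each rowid (AAG, AAH, AAI) shows that the three rows are stored in data files 6, 7, and 8.
Note: see http://blog.itpub.net/28929558/viewspace-1150766/ for background on rowid.
To verify: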

SQL> select x.FILE#,x.NAME from v$datafile x;

     FILE# NAME
---------- ----------------------------------------
         1 D:\ORACLE\ORADATA\CRISS_DB\SYSTEM01.DBF
         2 D:\ORACLE\ORADATA\CRISS_DB\SYSAUX01.DBF
         3 D:\ORACLE\ORADATA\CRISS_DB\UNDOTBS01.DBF
         4 D:\ORACLE\ORADATA\CRISS_DB\USERS01.DBF
         5 D:\ORACLE\ORADATA\CRISS_DB\TEST01.DBF
         6 D:\ORACLE\ORADATA\CRISS_DB\USER01.DBF
         7 D:\ORACLE\ORADATA\CRISS_DB\USER02.DBF
         8 D:\ORACLE\ORADATA\CRISS_DB\USER03.DBF

8 rows selected
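
How the VALUES LESS THAN bounds decide where a row goes can be sketched in a few lines of Python. This only models the routing rule for the range_orders table above; it is not how Oracle implements it, and `route`, `BOUNDS`, and `NAMES` are names invented for the illustration.

```python
from bisect import bisect_right
from datetime import date

# Upper bounds copied from the range_orders DDL; each bound is exclusive.
BOUNDS = [date(2014, 3, 31), date(2014, 4, 30), date(2014, 5, 31)]
NAMES = ["p1", "p2", "p3"]


def route(order_date: date) -> str:
    """Return the first partition whose VALUES LESS THAN bound is above the key."""
    i = bisect_right(BOUNDS, order_date)
    if i == len(BOUNDS):
        # No MAXVALUE partition exists, so such a key has nowhere to go.
        raise ValueError("key does not map to any partition")
    return NAMES[i]


for d in [date(2014, 3, 21), date(2014, 4, 21), date(2014, 5, 21)]:
    print(d, route(d))
```

Because the bounds are exclusive, a row dated exactly 2014-03-31 lands in p2, not p1.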


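Hash partitioning applies a hash function to one or more columns and stores each row in a partition chosen by the resulting hash value.
Hash partitioning spreads the data fairly evenly across the partitions.

Example: creating a hash-partitioned table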

SQL> create table hash_orders

(order_id varchar2(10) constraint HOR_PK primary key
,order_date date default sysdate
,qty integer
,payterms varchar2(10)
,book_id number(6)
)
partition by hash(order_id)
( partition hash_p1 tablespace user01
,partition hash_p2 tablespace user02
);

 

Table created

SQL> insert into hash_orders select * from range_orders;

3 rows inserted

 

SQL> select rowid,h.* from hash_orders h;

ROWID              ORDER_ID   ORDER_DATE  QTY  PAYTERMS   BOOK_ID
------------------ ---------- ----------- ---- ---------- -------
AAADyIAAGAAAACNAAA 10002      2014/4/21   1    payterm_2  110745
AAADyIAAGAAAACNAAB 10003      2014/5/21   1    payterm_3  110945
AAADyJAAHAAAACNAAA 10001      2014/3/21   1    payterm_1  110345
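
The idea of hash routing can be sketched as follows. Oracle's own hash function is internal, so this example uses CRC32 purely as a stand-in, and the partition it picks for a given key will not necessarily match the rowids shown above. `hash_partition` is a hypothetical helper.

```python
import zlib


def hash_partition(key: str, partitions: list[str]) -> str:
    """Pick a partition from a stable hash of the key (CRC32 is only a stand-in)."""
    return partitions[zlib.crc32(key.encode("utf-8")) % len(partitions)]


parts = ["hash_p1", "hash_p2"]
for order_id in ["10001", "10002", "10003"]:
    print(order_id, hash_partition(order_id, parts))
```

The same key always maps to the same partition, and with many distinct keys the rows spread roughly evenly, which is the property the text describes.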

 

Continued: http://blog.csdn.net/woshimyc/article/details/73289798

 

Using WITH ... AS in Oracle

-- Syntax:

with tempName as (select ….)
select …

 

-- Example: get 11 to 14 out of the numbers 1 to 19. The usual SQL is:
select * from
(
    -- generate the rows 1 to 19
    SELECT LEVEL AS lv
    FROM DUAL
    CONNECT BY LEVEL < 20 ) tt
WHERE tt.lv > 10 AND tt.lv < 15;

-- The same query using WITH ... AS:
with TT as (
    -- generate the rows 1 to 19
    SELECT LEVEL AS lv
    FROM DUAL
    CONNECT BY LEVEL < 20 )
select lv from TT
WHERE lv > 10 AND lv < 15;

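/* A WITH query starts with the WITH keyword rather than SELECT.
You can think of it as building a temporary table TT before the real query runs, which can then be referenced several times for further analysis and processing.

Advantages of the WITH clause
It makes SQL easier to read, and the structure stays clearer when several subqueries are involved. More importantly, it allows "analyse once, use many times",
which is where the performance gain comes from: less data is read.
When a plain subquery is repeated, the table is scanned once per reference; when the WITH subquery is materialized, the table is scanned only once.
That can noticeably improve the efficiency of analysis and queries.
In the execution plan of such a query, "SYS_TEMP_XXXX" is the temporary table of intermediate results built at run time. */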

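-- Using WITH in a view to join tables
CREATE OR REPLACE VIEW WITH_V AS
WITH DEPT_V AS (SELECT * FROM DEPT),
EMP_V AS (SELECT * FROM EMP)
SELECT D.DNAME,D.LOC,E.* FROM EMP_V E
LEFT JOIN DEPT_V D
ON D.DEPTNO = E.DEPTNO;

-- An example of using WITH:
/* Find the departments whose total salary is greater than the average of all departments' total salaries. The department table is s_dept and the employee table is s_emp (the query below uses DEPT and EMP).
Analysis: first compute the total salary of every department, then compute the average of those totals,
and finally keep the departments whose total is above that average. So the first WITH subquery returns the total salary per department,
the second WITH subquery computes the average from the result of the first, and the main query compares the two to return the departments whose total exceeds the average:
*/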

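WITH DEPT_COSTS AS -- total salary per department
 (SELECT D.DNAME, SUM(E.SAL) DEPT_TOTAL
    FROM DEPT D, EMP E
   WHERE E.DEPTNO = D.DEPTNO
   GROUP BY D.DNAME),
AVE_COST AS -- average of the department totals; a later WITH subquery can reference an earlier one
 (SELECT SUM(DEPT_TOTAL) / COUNT(*) AVG_SUM FROM DEPT_COSTS)
-- The final SELECT is cut off in the source; this one follows the description above.
SELECT *
  FROM DEPT_COSTS
 WHERE DEPT_TOTAL > (SELECT AVG_SUM FROM AVE_COST);

/* 1. What WITH AS means
The WITH AS clause, also called subquery factoring, defines a SQL fragment that the rest of the statement can use. It is especially useful with UNION ALL: the branches of a UNION ALL may share the same subquery, and executing it once per branch is expensive, whereas with WITH AS it only needs to run once. If the name defined by WITH AS is referenced two or more times, the optimizer will usually store its rows in a temporary table; if it is referenced only once, it will not. The MATERIALIZE hint forces the rows into a global temporary table. Many queries can be sped up this way.
2. A WITH AS example: select a column from table A; if A returns nothing, select it from table B; if neither A nor B returns anything, output 'error'. */

-- ROWNUM > n is never true for n >= 1, so the emptiness tests use plain NOT EXISTS.
with sql1 as
(select to_char(id) myid from a),
sql2 as
(select to_char(id) myid from b where not exists
(select 1 from sql1))
select * from sql1
union all
select * from sql2
union all
select 'error' from dual
where not exists (select 1 from sql1) and not exists
(select 1 from sql2);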
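
To try the department-cost query described above without an Oracle instance, the sketch below runs an equivalent WITH query on SQLite through Python's sqlite3 module. The table contents are invented sample data, SQLite stands in for Oracle, and the lower-case table and column names only mirror DEPT and EMP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER, sal REAL);
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'), (30, 'SALES');
    INSERT INTO emp VALUES (1, 10, 5000), (2, 10, 4000),
                           (3, 20, 3000), (4, 30, 1500), (5, 30, 1500);
""")

QUERY = """
WITH dept_costs AS (          -- step 1: total salary per department
    SELECT d.dname, SUM(e.sal) AS dept_total
    FROM dept d JOIN emp e ON e.deptno = d.deptno
    GROUP BY d.dname
),
avg_cost AS (                 -- step 2: average of those totals
    SELECT SUM(dept_total) / COUNT(*) AS avg_sum FROM dept_costs
)
SELECT dname, dept_total      -- step 3: keep departments above the average
FROM dept_costs
WHERE dept_total > (SELECT avg_sum FROM avg_cost)
ORDER BY dname
"""
rows = conn.execute(QUERY).fetchall()
print(rows)
```

With this sample data the totals are 9000, 3000, and 3000, the average is 5000, and only ACCOUNTING is returned.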

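A study guide from 左耳朵耗子

Do you feel that by the time you graduated you had only ever written toy programs? Even after starting work with little experience, you can still work through the exercises below. (A friend's complaint: university courses always start from theory, and the assignments never seem to have any practical use; it would be better to start from the needs of real work.)

Advice:

· Don't buy books indiscriminately and don't chase every new technology or buzzword. The fundamentals were built up over a long time and will stay relevant for at least another 10 years.

· Look back at history. Following how technology developed along the timeline helps you see what tomorrow will look like.

· Get your hands dirty. However simple an example is, type it in yourself at least once to check that you understand the details.

· Learn to think: ask why something is done this way and not another way, and practise drawing inferences from one case to others.

Note: you may wonder why what follows leans so heavily towards Unix/Linux. It is because I think programming for Windows may have little future, for these reasons:

· User interfaces are now dominated almost entirely by two things: 1) the Web, and 2) mobile devices running iOS or Android. The Windows GUI is no longer in demand.

· More and more companies build their systems on low-cost, high-performance Linux and open-source technology. Windows costs too much.

· Microsoft's technologies change too fast and do not last; they are simply toying with programmers. See 《Windows编程革命史》 for details.

So my personal view is that the trend is Web plus mobile on the front end and Linux plus open source on the back end. There is basically nothing left for Windows on the development side.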

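Getting started

1. Learn a scripting language, such as Python or Ruby

This takes away the fear of low-level languages, and a scripting language lets you produce useful small programs quickly. Practice projects:

· Process a text or CSV file (keywords: python csv, python open, python sys): read a local file and handle it line by line (for example a word count, or log processing).

· Walk the local file system (sys, os, path): for example, write a program that totals the size of every file under a directory, sorts the results by various criteria, and saves them.

· Talk to a database (python sqlite): write a small script that counts the rows in a database.

· Learn to debug with blunt tools such as print statements.

· Learn to use Google (phrase, domain, use reader to follow tech blogs).

Why learn a scripting language? Because they are so convenient. We often need a small tool or script to solve a problem, and that is when a heavyweight language turns out to be too cumbersome.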
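
As an illustration of the file-system exercise above, here is a minimal sketch that lists the files under a directory from largest to smallest. It is one possible starting point rather than a reference solution, and `files_by_size` is a name chosen for this example.

```python
import os


def files_by_size(root: str) -> list[tuple[int, str]]:
    """Return (size_in_bytes, path) for every regular file under root, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.isfile(path):  # skip broken symlinks and special files
                found.append((os.path.getsize(path), path))
    return sorted(found, reverse=True)


if __name__ == "__main__":
    # Show the ten largest files under the current directory.
    for size, path in files_by_size(".")[:10]:
        print(f"{size:>12}  {path}")
```

Sorting by other criteria (name, extension, modification time) only needs a different `key` for `sorted`, and saving the result is a matter of writing the list out with the csv module.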

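2. Get fluent with a programmer's editor (not an IDE) and some basic tools

· Vim / Emacs / Notepad++: learn how to configure code completion, appearance, external commands, and so on.

· Source Insight (or ctags)

The point of these tools is not to look cool: these editors are faster and more efficient for reading and changing code, configuration files, and logs.

3. Get familiar with the Unix/Linux shell and common commands

· If you use Windows, at least learn to use Linux in a virtual machine. VMware Player is free, so install Ubuntu.

· Use the graphical interface as little as possible.

· Learn to read help with man.

· File system layout and basic operations: ls/chmod/chown/rm/find/ln/cat/mount/mkdir/tar/gzip …

· Text-processing commands: sed/awk/grep/tail/less/more …

· Administration commands: ps/top/lsof/netstat/kill/tcpdump/iptables/dd …

· Get to know the configuration files under /etc, and learn to read the system logs under /var/log and the runtime information under /proc.

· Learn regular expressions and use them to search files.

For a programmer, Unix/Linux is much simpler than Windows. (See my CSDN post from four years ago, 《其实Unix很简单》.) Once you can use Unix/Linux you will find that a graphical interface is sometimes hard to work with and seriously hurts productivity.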

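4. Learn Web basics (HTML/CSS/JS) plus server-side technology (LAMP)

The future belongs to the Web. The best site for learning Web basics is W3School.

· Learn basic HTML syntax.

· Learn how CSS selects HTML elements and applies basic styles (keyword: box model).

· Use Firefox + Firebug or Chrome to inspect the structure of pages you find impressive and modify them live.

· Learn to manipulate HTML elements with JavaScript. Understand the DOM and dynamic pages (http://oreilly.com/catalog/9780596527402); the chapters available free online are enough. Or see DOM.

· Use Firefox + Firebug or Chrome to debug JavaScript (breakpoints, inspecting variables, performance, the console, and so on).

· Set up Apache or Nginx on a machine.

· Learn PHP and make back-end PHP exchange data with front-end HTML, to get a first idea of how a server responds to browser requests. Implement a form that submits data and displays it back.

· Connect PHP to a local or remote MySQL database (picking up MySQL and SQL as you go is enough).

· Follow a Web programming course from a top university to the end (for example http://www.stanford.edu/~ouster/cgi-bin/cs142-fall10/index.php). Don't assume it will take more than a semester: full-time students take 3 to 5 courses per semester, so you can keep up in your spare time.

· Learn a JavaScript library (such as jQuery or ExtJS) plus Ajax (asynchronously load an image or database content from the server) plus the JSON data format.

· HTTP: The Definitive Guide. After the first 4 chapters you will understand what happens every time you use a browser (proxy, gateway, browsers).

· Build a small site (for example a small message board with user login, Cookie/Session, create/delete/update/read, image attachments, and paging).

· Buy a domain, rent some hosting, and build your own site.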

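Intermediate

1. C and operating system calls

· Relearn C, understand pointers and the memory model, and implement the classic algorithms and data structures in C. Recommended: The Art of Computer Programming, Introduction to Algorithms, and Programming Pearls.

· Study the free MIT course Introduction to Computer Science and Programming.

· Study the free MIT course on memory management in C.

· Learn Unix/Linux system calls (Advanced Programming in the UNIX Environment) to understand things at the system level.

o Use this knowledge to work with the file system and users (write a small program that copies a directory tree).

o Write a multi-process program with fork/wait/waitpid, and a multi-threaded program with pthread that uses synchronization or mutual exclusion, for example a multi-process ticket-booking program.

o Use signal/kill/raise/alarm/pause/sigprocmask to write a program in which several processes communicate through signals.

o Learn to build and debug programs with gcc and gdb (see my 《用gdb调试程序》).

o Learn to build programs with a makefile (see my 《跟我一起写makefile》).

o IPC and sockets can wait until the advanced stage.

· Learn Windows SDK programming (Programming Windows, Programming Windows with MFC).

o Write a window and understand WinMain/WinProcedure and the Windows message mechanism.

o Write programs that work with resource files, the various graphical controls, and drawing in the Windows SDK.

o Learn to look up SDK functions, the WM_ messages, and sample code in MSDN.

o The book has many sample programs. Don't copy them; try writing your own.

o There is no need to master all of this, because the GUI is being replaced by the Web; the aim is to understand how Windows GUI programming works. @virushuo says: "I think GUIs really are less fashionable now, but thoroughly understanding how a GUI works is important. That includes mobile development, which is still hard without the fundamentals. Put another way, mobile development requires understanding how a GUI works, and you can learn that either on Windows or on Mac/iOS."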

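2. Learn Java

· The main texts are the classics Core Java (《Java 核心技术编程》) and Thinking in Java (《Java编程思想》). (Core Java comes in two volumes; I link only the first, which is enough, because a passing knowledge of Java GUI programming will do.)

· Learn the JDK and how to consult the Java API docs: http://download.oracle.com/javase/6/docs/api/

· Understand how a virtual-machine language like Java differs from C and Python in compilation and execution. Think about what "cross-platform" means for C, Java, and Python.

· Learn the Eclipse IDE and use it to build, debug, and develop Java programs.

· Set up a site on Tomcat and try Web development with JSP/Servlet/JDBC/MySQL. Reimplement the small PHP project from earlier with JSP and Servlets.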

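3. Web security and architecture

· Learn HTML5. There are plenty of tutorials online, and 酷壳 (CoolShell) has covered many before, so I won't list them here.

· Learn about security in Web development (see the attack on Sina Weibo, and this article on Ruby).

· Learn the rewrite mechanism of HTTP servers, Nginx reverse proxying, and FastCGI (for example PHP-FPM).

· Learn static page caching for the Web.

· Learn architectures for asynchronous workflow processing, data caching, data partitioning, load balancing, and horizontal scaling.

Practice tasks:

o Make some Web animations with the HTML5 canvas.

o Try SQL injection, JS injection, and XSS attacks against the Web application you built earlier.

o Rebuild that Web application on Nginx + PHP-FPM + static page caching.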

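4. Learn relational databases

· Install MS SQL Server or MySQL to learn with.

· Learn the textbook normal forms for database design: 1NF, 2NF, 3NF, …

· Learn stored procedures, triggers, views, indexing, cursors, and so on.

· Learn SQL and understand the different kinds of table join (see 《SQL Join的图示》).

· Learn how to optimize database queries (see 《MySQL的优化》).

· Practice task: design a database for a forum that satisfies at least 3NF, and write SQL to find the newest posts this week and this month, the most-commented posts, and the most active users.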

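5. Some development tools

· Learn to manage versions with SVN or Git.

· Learn to unit test Java with JUnit.

· Learn a coding standard or coding guideline for C and Java. (Years ago I wrote a very simple article about C, 《编程修养》; you can find plenty of material like it online.)

· Recommended reading: Code Complete, Refactoring, and Clean Code.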

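Advanced

1. C++ / Java and object orientation

My personal view is that once you have learned C++ well, Java takes little effort. The C++ learning curve is very steep, though. Even so, I think C++ is the language most worth learning well. See two amusing pieces, "C++学习信心图" and "21天学好C++".

· Study the free MIT course on object-oriented programming in C++.

· Read the books recommended in my "如何学好C++" at least twice. (If your understanding of C++ reaches the depth of my 《C++虚函数表解析》, 《C++对象内存布局》, or 《C/C++返回内部静态成员的陷阱》, you are doing very well.)

· Then reflect on why C++ does things this way and Java does not. Be sure to learn to compare the two, for example initialization, garbage collection, interfaces, exceptions, virtual functions, and so on in Java.

Practice tasks:

o Implement a BigInt in C++ that supports addition, subtraction, multiplication, and division on 128-bit integers.

o Wrap a data-structure container in C++, for example a hash table.

o Wrap and implement a smart pointer in C++ (it must use templates).

· Design Patterns is required reading, at least twice; think about where each of the 23 patterns applies. Two points matter most: 1) favor composition over inheritance, and 2) program to an interface, not an implementation. (Head First Design Patterns is also recommended.)

Practice tasks:

o Implement a memory pool with the factory pattern.

o Use the strategy pattern to build a class that can left-align, right-align, and center a text file.

o Use the command pattern to implement a command-line calculator with undo and redo.

o Use the decorator pattern to implement a hotel room pricing policy, with factors such as peak season, services, VIP status, and tour groups affecting the price.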
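
To make the command-pattern exercise above concrete, here is a minimal sketch of a calculator core with undo and redo. It is written in Python for brevity, although the exercise asks for C++, and it leaves out the command-line parsing. The class and method names are choices made for this example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


class Command:
    """One calculator step that remembers how to undo itself."""

    def __init__(self, op: str, operand: float):
        self.op, self.operand = op, operand
        self.before = None  # accumulator value seen by the last execute()

    def execute(self, value: float) -> float:
        self.before = value
        return OPS[self.op](value, self.operand)

    def undo(self) -> float:
        return self.before


class Calculator:
    def __init__(self):
        self.value = 0.0
        self.done, self.undone = [], []  # history stacks for undo and redo

    def apply(self, op: str, operand: float) -> float:
        cmd = Command(op, operand)
        self.value = cmd.execute(self.value)
        self.done.append(cmd)
        self.undone.clear()  # a new command invalidates the redo history
        return self.value

    def undo(self) -> float:
        if self.done:
            cmd = self.done.pop()
            self.value = cmd.undo()
            self.undone.append(cmd)
        return self.value

    def redo(self) -> float:
        if self.undone:
            cmd = self.undone.pop()
            self.value = cmd.execute(self.value)
            self.done.append(cmd)
        return self.value
```

The design choice worth noticing is that each command object carries the state it needs to reverse itself, so the calculator only has to manage two stacks.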

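· Learn how to use the STL and the concepts behind its design: containers, algorithms, iterators, and function objects. Read its source if you can.

· Practice tasks: combine object orientation, the STL, design patterns, and Windows SDK graphics programming.

o Write a Snake or Tetris game with different levels and difficulty settings.

o Write a file browser that lists the files in a directory and handles each type differently: text files open for editing, executables run, mp3 or avi files play, and image files are displayed.

· Study the design of some C++ libraries, such as MFC (see 侯捷's 《深入浅出MFC》), Boost, ACE, CPPUnit, and the STL. (The STL may be too hard, but it would be great if you came to understand the patterns and design behind it, and better still if you reached the depth of my 《STL string类的写时拷贝技术》. ACE requires strong systems knowledge; see "Strengthen your understanding of systems" below.)

· Java is a truly object-oriented language with design patterns everywhere, which makes it the best language for learning object-oriented design patterns (see the design patterns in Java).

· Recommended reading: Effective Java and Java Puzzlers.

· Learn the Java frameworks. There are many, such as Spring, Hibernate, and Struts; the point is to learn their design, for example IoC.

· Java technologies are just as numerous. Focus on the J2EE architecture and on messaging and remote invocation technologies such as JMS and RMI.

· Learn to build a Web Service in Java (the official tutorial is here).

· Practice task: under Spring or Hibernate, build a networked program that calls a Web Service remotely and passes messages between two services over JMS.

C++ and Java cannot be learned well in a short time. C++ is about depth and Java is about breadth, so I suggest picking one of them. My own path was:

· Digging deep into C++ (I have been digging into C/C++ for more than ten years).

· Learning the various design patterns in Java.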

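2. Strengthen your understanding of systems

Focus on reading the following books:

· The Art of Unix Programming: the design and development philosophy, culture, principles, and experience of the Unix world. It is likely to be an eye-opener.

· Unix Network Programming, Volume 1: The Sockets Networking API: the book that makes network programming click. Pay particular attention to TCP, UDP, and the differences between the multiplexing system calls select/poll/epoll.

· TCP/IP Illustrated, Volume 1: The Protocols: finish it and you could be a network hacker. Learn how Ethernet works, how the TCP/IP protocols work, and how to tune TCP.

Practice tasks:

o Understand blocking IO, non-blocking IO, and IO multiplexing (select, poll, epoll), and how they differ from truly asynchronous IO.

o Write a network chat program with a chat server and several chat clients (the server uses UDP to multicast or broadcast to some or all of the clients).

o Write a simple HTTP server.

· Unix Network Programming, Volume 2: Interprocess Communications: semaphores, pipes, shared memory, message queues, and other IPC mechanisms. These may seem a little dated, but they are still worth knowing.

Practice tasks:

o Mainly, practise the various IPC methods.

o Write a pipe program in which a parent and child process exchange data through a pipe.

o Write a shared-memory program in which two processes exchange an array of C structs through shared memory.

· Study 《Windows核心编程》. Master CreateProcess, Windows threads, thread scheduling, thread synchronization (events, semaphores, mutexes), asynchronous I/O, memory management, and DLLs.

· Practice task: use CreateProcess to start Notepad or IE and monitor the program while it runs. Reimplement the simple HTTP server from earlier with a thread pool. Write a DLL hook that monitors the close event of a given window or records the keystrokes in a window.

· With multithreading, inter-process communication, TCP/IP, sockets, C++, and design patterns as a foundation, you can look into ACE. Rewrite the chat program and the HTTP server (with a thread pool) using ACE.

· Practice task: using everything above, try the following:

o Write a server that sends a large file to a client and uses more than 80% of a 100M link. (Disk I/O and network I/O can both become bottlenecks, so think about how to deal with that, and keep the network MTU in mind.)

o Learn how BitTorrent downloading works and simulate it with multiple processes.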

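3. System architecture

· Load balancing: hash-based and fully dynamic. (Search Google Scholar for papers on load balancing.)

· Multi-tier distributed systems: client/service node tier, compute node tier, data cache tier, and data tier. J2EE is the classic multi-tier structure.

· CDN systems: serving from the nearest location and pushing content to the edge.

· P2P systems: study the algorithms of BitTorrent and eDonkey, for example DHT.

· Server backup and two-machine standby systems (Live-Standby and Live-Live): how do two machines monitor each other with a heartbeat? Backing up a cluster's master node.

· Virtualization, which lets you switch, reconfigure, and redeploy an operating system as if it were an application.

· Learn Thrift, a high-performance binary communication middleware that supports data (object) serialization and several kinds of RPC service.

· Learn Hadoop. The core of the Hadoop framework is MapReduce and HDFS. The idea of MapReduce became widely known through a Google paper; in one sentence, MapReduce is "splitting up a task and merging the results". HDFS stands for Hadoop Distributed File System and provides the underlying storage for distributed computation.

Learn about NoSQL databases (some say the technology may be overhyped). Very large, highly concurrent, fully dynamic sites are becoming mainstream, and SNS sites have hard requirements such as real-time data access, so NoSQL databases have gradually become a focus of attention and look likely to replace relational databases as the mainstream storage model. There are many NoSQL databases today, most of them open source; the better-known ones include MemcacheDB and Redis.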

Java date formatting

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class CX {
    public static void main(String[] args) throws ParseException {
        SimpleDateFormat sdf;
        Date date;
        String s;

        // Sample input: Jul 20, 2016 2:05:09 pm
        String ds = "Jul 20, 2016 2:05:09 pm";

        // Four or more M's format the full month name ("June").
        date = new Date();
        sdf = new SimpleDateFormat("MMMMM dd,yyyy hh:mm:ss a", Locale.ENGLISH);
        s = sdf.format(date);
        System.out.println(s);
        date = sdf.parse(s);
        System.out.println("date = " + date);
        // Parsing also accepts the abbreviated month name in ds.
        date = sdf.parse(ds);
        System.out.println("date2 = " + date);

        // Three M's format the abbreviated month name ("Jul").
        sdf = new SimpleDateFormat("MMM dd,yyyy hh:mm:ss aa", Locale.ENGLISH);
        s = sdf.format(date);
        System.out.println(s);
        date = sdf.parse(s);
        System.out.println("date = " + date);
        date = sdf.parse(ds);
        System.out.println("date2 = " + date);
    }
}

Output:

June 22,2017 03:33:48 PM

date = Thu Jun 22 15:33:48 CST 2017

date2 = Wed Jul 20 14:05:09 CST 2016

Jul 20,2016 02:05:09 PM

date = Wed Jul 20 14:05:09 CST 2016

date2 = Wed Jul 20 14:05:09 CST 2016
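
For comparison, the same sample string can be parsed and reformatted with Python's datetime module. This is only a side-by-side sketch: the directive letters differ from Java's pattern letters, and month names and the AM/PM marker follow the process locale (the default "C" locale is English).

```python
from datetime import datetime

# %b = abbreviated month name, %I = hour on the 12-hour clock, %p = AM/PM marker.
parsed = datetime.strptime("Jul 20, 2016 2:05:09 pm", "%b %d, %Y %I:%M:%S %p")
print(parsed)

# Format it back in the layout of the second Java pattern ("MMM dd,yyyy hh:mm:ss aa").
print(parsed.strftime("%b %d,%Y %I:%M:%S %p"))
```

As in the Java output, the 12-hour input "2:05:09 pm" becomes 14:05:09 once parsed.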

WordPress Backups « WordPress Codex

Note: Want to skip the hard stuff? Skip to Automated Solutions such as WordPress Plugins for backups.

Your WordPress database contains every post, every comment and every link you have on your blog. If your database gets erased or corrupted, you stand to lose everything you have written. There are many reasons why this could happen and not all are things you can control. With a proper backup of your WordPress database and files, you can quickly restore things back to normal.

Instructions to back up your WordPress site include:

  1. WordPress Site and your WordPress Database
  2. Automatic WordPress backup options

In addition, support is provided online at the WordPress Support Forum to help you through the process.

Site backups are essential because problems inevitably occur and you need to be in a position to take action when disaster strikes. Spending a few minutes to make an easy, convenient backup of your database will allow you to spend even more time being creative and productive with your website.

Source: WordPress Backups « WordPress Codex

Making Photos Smaller Without Quality Loss – by Yelp

Yelp has over 100 million user-generated photos ranging from pictures of dinners or haircuts, to one of our newest features, #yelfies. These images account for a majority of the bandwidth for users of the app and website, and represent a significant cost to store and transfer. In our quest to give our users the best experience, we worked hard to optimize our photos and were able to achieve a 30% average size reduction. This saves our users time and bandwidth and reduces our cost to serve those images. Oh, and we did it all without reducing the quality of these images!

Background

Yelp has been storing user-uploaded photos for over 12 years. We save lossless formats (PNG, GIF) as PNGs and all other formats as JPEG. We use Python and Pillow for saving images, and start our story of photo uploads with a snippet like this:

With this as a starting point, we began to investigate potential optimizations on file size that we could apply without a loss in quality.

Optimizations

First, we had to decide whether to handle this ourselves or let a CDN provider magically change our photos. With the priority we place on high quality content, it made sense to evaluate options and make potential size vs quality tradeoffs ourselves. We moved ahead with research on the current state of photo file size reduction – what changes could be made and how much size / quality reduction was associated with each. With this research completed, we decided to work on three primary categories. The rest of this post explains what we did and how much benefit we realized from each optimization.

  1. Changes in Pillow
    • Optimize flag
    • Progressive JPEG
  2. Changes to application photo logic
    • Large PNG detection
    • Dynamic JPEG quality
  3. Changes to JPEG encoder
    • Mozjpeg (trellis quantization, custom quantization matrix)

Changes in Pillow

Optimize Flag

This is one of the easiest changes we made: enabling the setting in Pillow responsible for additional file size savings at the cost of CPU time (optimize=True). Due to the nature of the tradeoff being made, this does not impact image quality at all.

For JPEG, this flag instructs the encoder to find the optimal Huffman coding by making an additional pass over each image scan. Each first pass, instead of writing to file, calculates the occurrence statistics of each value, required information to compute the ideal coding. PNG internally uses zlib, so the optimize flag in that case effectively instructs the encoder to use gzip -9 instead of gzip -6.

This is an easy change to make but it turns out that it is not a silver bullet, reducing file size by just a few percent.
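
The effort-versus-size tradeoff described for PNG can be seen with zlib directly. This is a small sketch using Python's standard zlib module on made-up repetitive bytes that stand in for image data; it does not involve Pillow or a real PNG.

```python
import zlib

# Repetitive sample bytes standing in for PNG scanlines (an assumption for this demo).
data = b"".join(bytes([i % 251, (i * 7) % 13, 0, 0]) for i in range(50_000))

default = zlib.compress(data, 6)  # zlib's usual default effort
maximum = zlib.compress(data, 9)  # the higher effort level the text describes

print(len(data), len(default), len(maximum))

# Both levels are lossless: extra effort changes only output size and CPU time.
assert zlib.decompress(default) == zlib.decompress(maximum) == data
```

How much level 9 gains over level 6 depends on the input, which fits the observation that the flag saves only a few percent.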

Progressive JPEG

When saving an image as a JPEG, there are a few different types you can choose from:

  • Baseline JPEG images load from top to bottom.
  • Progressive JPEG images load from more blurry to less blurry. The progressive option can easily be enabled in Pillow (progressive=True). As a result, there is a perceived performance increase (that is, it’s easier to notice when an image is partially absent than it is to tell it’s not fully sharp).

Additionally, the way progressive files are packed generally results in a small reduction to file size. As more fully explained by the Wikipedia article, JPEG format uses a zigzag pattern over the 8×8 blocks of pixels to do entropy coding. When the values of those blocks of pixels are unpacked and laid out in order, you generally have non-zero numbers first and then sequences of 0s, with that pattern repeating and interleaved for each 8×8 block in the image. With progressive encoding, the order of the unwound pixel blocks changes. The higher value numbers for each block come first in the file, (which gives the earliest scans of a progressive image its distinct blockiness), and the longer spans of small numbers, including more 0s, that add the finer details are towards the end. This reordering of the image data doesn’t change the image itself, but does increase the number of 0s that might be in a row (which can be more easily compressed).
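
The zigzag order mentioned above is short enough to write out. The sketch below generates the scan order for an 8×8 block and applies it to a made-up quantized block whose energy sits in the top-left (low-frequency) corner; the coefficient values are invented for the example.

```python
def zigzag_order(n: int = 8) -> list[tuple[int, int]]:
    """(row, col) pairs of an n x n block in JPEG zigzag scan order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals in turn. Odd diagonals run with the row increasing,
    # even diagonals run with the column increasing.
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


# A hypothetical quantized block: a few low-frequency coefficients, zeros elsewhere.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 52, -3, 4, 1

scan = [block[r][c] for r, c in zigzag_order()]
last_nonzero = max(i for i, v in enumerate(scan) if v)
trailing_zeros = len(scan) - last_nonzero - 1
print(scan[:6], trailing_zeros)  # prints [52, -3, 4, 0, 1, 0] 59
```

The long run of trailing zeros is what the entropy coder exploits, and reordering data so that zeros cluster together is the effect the paragraph above attributes to progressive encoding.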

Comparison with a delicious user-contributed image of a donut (click for larger):

(left) A mock of how a baseline JPEG renders.

(right) A mock of how a progressive JPEG renders.

Changes to Application Photo Logic

Large PNG Detection

Yelp targets two image formats for serving user-generated content – JPEG and PNG. JPEG is a great format for photos but generally struggles with high-contrast design content (like logos). By contrast, PNG is fully-lossless, so great for graphics but too large for photos where small distortions are not visible. In the cases where users upload PNGs that are actually photographs, we can save a lot of space if we identify these files and save them as JPEG instead. Some common sources of PNG photos on Yelp are screenshots taken by mobile devices and apps that modify photos to add effects or borders.

(left) A typical composited PNG upload with logo and border. (right) A typical PNG upload from a screenshot.

We wanted to reduce the number of these unnecessary PNGs, but it was important to avoid overreaching and changing format or degrading quality of logos, graphics, etc. How can we tell if something is a photo? From the pixels?

Using an experimental sample of 2,500 images, we found that a combination of file size and unique pixels worked well to detect photos. We generate a candidate thumbnail image at our largest resolution and see if the output PNG file is larger than 300KiB. If it is, we’ll also check the image contents to see if there are over 2^16 unique colors (Yelp converts RGBA image uploads to RGB, but if we didn’t, we would check that too).

In the experimental dataset, these hand-tuned thresholds to define “bigness” captured 88% of the possible file size savings (i.e. our expected file size savings if we were to convert all of the images) without any false-positives of graphics being converted.
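
The two-threshold rule described above can be sketched as a small function. The thresholds come from the text; everything else is an assumption for illustration: `looks_like_photo` is a hypothetical name, and the pixels are passed in as an iterable of RGB tuples (with Pillow they could come from `Image.getdata()`), so the sketch itself needs no imaging library.

```python
from typing import Iterable, Tuple

SIZE_LIMIT_BYTES = 300 * 1024  # 300 KiB, from the text
COLOR_LIMIT = 2 ** 16          # unique-color threshold, from the text


def looks_like_photo(png_size_bytes: int,
                     pixels: Iterable[Tuple[int, int, int]]) -> bool:
    """True when a PNG thumbnail is both large and has many distinct RGB colors."""
    if png_size_bytes <= SIZE_LIMIT_BYTES:
        return False  # cheap check first: small files stay PNG
    seen = set()
    for rgb in pixels:
        seen.add(rgb)
        if len(seen) > COLOR_LIMIT:
            return True  # stop early once the threshold is crossed
    return False
```

Checking file size first means the more expensive color count only runs for the minority of uploads that are large enough to matter.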

Dynamic JPEG Quality

The first and most well-known way to reduce the size of JPEG files is a setting called quality. Many applications capable of saving to the JPEG format specify quality as a number.

Quality is somewhat of an abstraction. In fact, there are separate qualities for each of the color channels of a JPEG image. Quality levels 0 – 100 map to different quantization tables for the color channels, determining how much data is lost (usually high frequency). Quantization in the signal domain is the one step in the JPEG encoding process that loses information.

The simplest way to reduce file size is to reduce the quality of the image, introducing more noise. Not every image loses the same amount of information at a given quality level though.

We can dynamically choose a quality setting which is optimized for each image, finding an ideal balance between quality and size. There are two ways to do this:

  • Bottom-up: These are algorithms that generate tuned quantization tables by processing the image at the 8×8 pixel block level. They calculate both how much theoretical quality was lost and how that lost data either amplifies or cancels out to be more or less visible to the human eye.
  • Top-down: These are algorithms that compare an entire image against an original version of itself and detect how much information was lost. By iteratively generating candidate images with different quality settings, we can choose the one that meets a minimum evaluated level by whichever evaluation algorithm we choose.

We evaluated a bottom-up algorithm, which in our experience did not yield suitable results at the higher end of the quality range we wanted to use (though it seems like it may still have potential in the mid-range of image qualities, where an encoder can begin to be more adventurous with the bytes it discards). Many of the scholarly papers on this strategy were published in the early 90s, when computing power was at a premium, and took shortcuts that the top-down approach addresses, such as not evaluating interactions across blocks.

So we took the second approach: use a bisection algorithm to generate candidate images at different quality levels, and evaluate each candidate image’s drop in quality by calculating its structural similarity metric (SSIM) using pyssim, until that value is at a configurable but static threshold. This enables us to selectively lower the average file size (and average quality) only for images which were above a perceivable decrease to begin with.

In the below chart, we plot the SSIM values of 2500 images regenerated via 3 different quality approaches.

  1. The original images made by the current approach at quality = 85 are plotted as the blue line.
  2. An alternative approach to lowering file size, changing quality = 80, is plotted as the red line.
  3. And finally, the approach we ended up using, dynamic quality, SSIM 80-85, in orange, chooses a quality for the image in the range 80 to 85 (inclusive) based on meeting or exceeding an SSIM ratio: a pre-computed static value that made the transition occur somewhere in the middle of the images range. This lets us lower the average file size without lowering the quality of our worst-quality images.

SSIMs of 2500 images with 3 different quality strategies.

SSIM?

There are quite a few image quality algorithms that try to mimic the human vision system. We’ve evaluated many of these and think that SSIM, while older, is most suitable for this iterative optimization based on a few characteristics:

  1. Sensitive to JPEG quantization error
  2. Fast, simple algorithm
  3. Can be computed on PIL native image objects without converting images to PNG and passing them to CLI applications (see #2)

Example Code for Dynamic Quality:
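
What follows is a minimal sketch of the bisection step, not the code from the original post. It assumes a hypothetical callable `ssim_at(quality)` that encodes the image at the given JPEG quality and returns its SSIM against the original (in practice that would be built on Pillow and an SSIM library such as pyssim), and it assumes SSIM does not decrease as quality increases.

```python
from typing import Callable


def pick_quality(ssim_at: Callable[[int], float], target: float,
                 low: int = 80, high: int = 85) -> int:
    """Lowest quality in [low, high] whose SSIM meets target, else high.

    Bisection is valid only if ssim_at(q) never decreases as q grows.
    """
    best = high
    while low <= high:
        mid = (low + high) // 2
        if ssim_at(mid) >= target:
            best = mid      # good enough: try for a smaller file
            high = mid - 1
        else:
            low = mid + 1   # too lossy: raise the quality
    return best


# Stand-in for a real measurement: SSIM rising linearly with quality.
def fake_ssim(q: int) -> float:
    return 0.90 + 0.01 * (q - 80)


print(pick_quality(fake_ssim, target=0.925))
```

Each candidate costs a full encode plus an SSIM computation, so bisecting over a six-value range keeps that to at most three evaluations per image.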

There are a few other blog posts about this technique; here is one by Colt McAnlis. And as we go to press, Etsy has published one here! High five, faster internet!

Changes to JPEG Encoder

Mozjpeg

Mozjpeg is an open-source fork of libjpeg-turbo, which trades execution time for file size. This approach meshes well with the offline batch approach to regenerating images. With the investment of about 3-5x more time than libjpeg-turbo, a few more expensive algorithms make images smaller!

One of mozjpeg’s differentiators is the use of an alternative quantization table. As mentioned above, quality is an abstraction of the quantization tables used for each color channel. All signs point to the default JPEG quantization tables as being pretty easy to beat. In the words of the JPEG spec:

These tables are provided as examples only and are not necessarily suitable for any particular application.

So naturally, it shouldn’t surprise you to learn that these tables are the default used by most encoder implementations… 🤔🤔🤔

Mozjpeg has gone through the trouble of benchmarking alternative tables for us, and uses the best performing general-purpose alternative for images it creates.

Mozjpeg + Pillow

Most Linux distributions have libjpeg installed by default, so using mozjpeg under Pillow doesn't work out of the box, but configuring it isn't terribly difficult either. When you build mozjpeg, use the --with-jpeg8 flag and make sure it is installed where Pillow will find it and link against it. If you're using Docker, you might have a Dockerfile like:

That’s it! Build it and you’ll be able to use Pillow backed by mozjpeg within your normal images workflow.

Impact

How much did each of those improvements matter for us? We started this research by randomly sampling 2,500 of Yelp’s business photos to put through our processing pipeline and measure the impact on file size.

  1. Changes to Pillow settings were responsible for about 4.5% of the savings
  2. Large PNG detection was responsible for about 6.2% of the savings
  3. Dynamic Quality was responsible for about 4.5% of the savings
  4. Switching to the mozjpeg encoder was responsible for about 13.8% of the savings

This adds up to an average image file size reduction of around 30%, which we applied to our largest and most common image resolutions, making the website faster for users and saving terabytes a day in data transfer. As measured at the CDN:

Average filesize over time, as measured from the CDN (combined with non-image static content).

What we didn’t do

This section is intended to introduce a few other common improvements that you might be able to make, that either weren’t relevant to Yelp due to defaults chosen by our tooling, or tradeoffs we chose not to make.

Subsampling

Subsampling is a major factor in determining both quality and file size for web images. Longer descriptions of subsampling can be found online, but suffice it to say for this blog post that we were already subsampling at Pillow's default when nothing else is specified (Pillow releases before 4.3 called this setting 4:1:1, although it is actually 4:2:0), so we weren't able to realize any further savings here.

Lossy PNG encoding

After learning what we did about PNGs, choosing to preserve some of them as PNG but with a lossy encoder like pngmini could have made sense, but we chose to resave them as JPEG instead. This is an alternate option with reasonable results, 72-85% file size savings over unmodified PNGs according to the author.

Dynamic content types

Support for more modern content types like WebP or JPEG2k is certainly on our radar. Even once that hypothetical project ships, there will be a long-tail of users requesting these now-optimized JPEG/PNG images which will continue to make this effort well worth it.

SVG

We use SVG in many places on our website, like the static assets created by our designers that go into our styleguide. While this format and optimization tools like svgo are useful to reduce website page weight, it isn’t related to what we did here.

Vendor Magic

There are too many providers to list that offer image delivery / resizing / cropping / transcoding as a service. Including open-source thumbor. Maybe this is the easiest way to support responsive images, dynamic content types and remain on the cutting edge for us in the future. For now our solution remains self-contained.

Further Reading

Two books listed here absolutely stand on their own outside the context of the post, and are highly recommended as further reading on the subject.

Source: Making Photos Smaller Without Quality Loss

Netty : what’s it

Netty is an asynchronous event-driven network application framework
for rapid development of maintainable high performance protocol servers & clients.

 

Netty is a NIO client server framework which enables quick and easy development of network applications such as protocol servers and clients. It greatly simplifies and streamlines network programming such as TCP and UDP socket server.

'Quick and easy' doesn't mean that a resulting application will suffer from a maintainability or a performance issue. Netty has been designed carefully with the experience earned from implementing a lot of protocols such as FTP, SMTP, HTTP, and various binary and text-based legacy protocols. As a result, Netty has succeeded in finding a way to achieve ease of development, performance, stability, and flexibility without a compromise.

Features

Design

  • Unified API for various transport types – blocking and non-blocking socket
  • Based on a flexible and extensible event model which allows clear separation of concerns
  • Highly customizable thread model – single thread, one or more thread pools such as SEDA
  • True connectionless datagram socket support (since 3.1)

Ease of use

  • Well-documented Javadoc, user guide and examples
  • No additional dependencies, JDK 5 (Netty 3.x) or 6 (Netty 4.x) is enough
    • Note: Some components such as HTTP/2 might have more requirements. Please refer to the Requirements page for more information.

Performance

  • Better throughput, lower latency
  • Less resource consumption
  • Minimized unnecessary memory copy

Security

  • Complete SSL/TLS and StartTLS support

Community

  • Release early, release often
  • The author has been writing similar frameworks since 2003 and he still finds your feedback precious!

Source: Netty: Home