服务报价 | 域名主机 | 网络营销 | 软件工具| [加入收藏]
 热线电话: #
当前位置: 主页 > 开发教程 > postgresql教程 >

PostgreSQL优化器逻辑推理能力 源码解析

时间:2016-02-27 22:03来源:未知 作者:最模板 点击:
数据库优化器需要具备逻辑推理能力,而且越强越好,为什么呢? 举一些例子, 通过已知的一个人讲的是真话,推理另一个人讲的是不是真话。 例子1: 假设预先提供了 a 10 是真话 可

数据库优化器需要具备逻辑推理能力,而且越强越好,为什么呢?

举一些例子,

通过已知的一个人讲的是真话,推理另一个人讲的是不是真话。

例子1:

假设预先提供了 a > 10 是真话

可以推理出 a < 1 一定是假话

例子2:

假设预先提供了 a > 10 是真话

无法推理出 a < 100 一定是真话或假话

例子3:

假设预先提供了 a 是空 是真话

可以推理出 a 不是空 一定是假话

例子4:

假设预先提供了 a <>100 是真话

可以推理出 a =100 一定是假话

例子5:

假设预先提供了 a >100 是真话

可以推理出 a >1 一定是真话

例子6:

假设预先提供了 a 的坐标位置在中国 是真话

可以推理出 a 的坐标位置在浙江杭州 一定是真话

例子7:

假设预先提供了 平面中 坐标A和坐标(1,100)的距离小于100 是真话

是否推理出 坐标A和坐标(100,100)的距离小于1000 一定是真话或假话?

总结一下以上逻辑推理,首先要提供已知真假的一个表达式,然后推理另一个表达式的真假。推理可以得出的结论是真、或者假、或者不知道真假。

对于推理出来的结果一定是真或者一定是假的情况,数据库可以利用它来减少后期的处理。

这体现在优化器生成查询树之前。例如:

create table tab(id int check (id >=0), info text, crt_time timestamp);  
select * from tab where id<0;  

以上已知为真的表达式是id>=0,通过这个表达式能推理出SQL中给出的表达式 id<0 一定是假。那么优化器在执行这条SQL时,可以省去扫描表然后再过滤id<0的行,而是构造结构,并直接返回0条记录。我们看看执行计划:

digoal=# create table ta(id int check (id >=0), info text, crt_time timestamp);  
CREATE TABLE  
digoal=# explain select * from ta where id=-1;  
                     QUERY PLAN                       
----------------------------------------------------  
 Seq Scan on ta  (cost=0.00..24.12 rows=6 width=44)  
   Filter: (id = '-1'::integer)  
(2 rows)  

以上查询貌似并没有优化,还是扫描了表,原因是constraint_exclusion参数默认值对UNION ALL和分区表开启这种逻辑推理检查。将constraint_exclusion 改为ON即可对所有表进行逻辑推理检查。

digoal=# set constraint_exclusion =on;  
SET  
digoal=# explain select * from ta where id=-1;  -- 现在不需要扫描表了  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  
digoal=# explain select * from ta where id<-1;  -- 现在不需要扫描表了  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  

对于无法推理出一定为假的条件,还是需要扫描表的,例如 id<>0。

postgres=# explain select * from ta where id<>0;  
                      QUERY PLAN                         
-------------------------------------------------------  
 Seq Scan on ta  (cost=0.00..24.12 rows=1124 width=44)  
   Filter: (id <> 0)  
(2 rows)  

对于提供的表达式与已知的表达式操作符左侧不一致的,目前PG的优化器没有做到这么智能,例如 id+1<10,id+1<0,优化器不会对这种表达式进行逻辑推理,后面我会在代码中分析这块。

postgres=# explain select * from ta where id+1<10;  
                      QUERY PLAN                        
------------------------------------------------------  
 Seq Scan on ta  (cost=0.00..26.95 rows=377 width=44)  
   Filter: ((id + 1) < 10)  
(2 rows)  
postgres=# explain select * from ta where id+1<0;  
                      QUERY PLAN                        
------------------------------------------------------  
 Seq Scan on ta  (cost=0.00..26.95 rows=377 width=44)  
   Filter: ((id + 1) < 0)  
(2 rows)  

id+1<0 是可以转换为 id< 0-1的 ,对于以下表达式,PG进行了推理,原因是-操作符是一个immutable操作符,0-1可以转为常数-1从而可以进行推理。

postgres=# explain select * from ta where id<0-1;  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  

目前PostgreSQL数据库支持哪些逻辑推理呢?

.1. 约束中包含的表达式的操作符必须是B-tree-indexable operators(或者is null, or , is not null),也就是可以被btree索引用于检索操作符,例如<,<=,=,>,>=以及<> (<>不能直接被索引使用,但是可以转换为< OR >来使用索引)。

B-tree-indexable operators

.2. SQL语句where字句中提供的表达式,同样操作符必须是B-tree-indexable operators。

.3. SQL语句where字句中提供的表达式,操作符左侧的操作数必须与约束中的操作数完全一致。

例如约束为(check mod(id,4) = 0),SQL where字句提供的表达式则必须为 mod(id,4) op? ? 这种形式才会进行推理。

又如约束为(check id*100 > 1000),SQL where字句提供的表达式则必须为 id*100 op? ? 这种形式才会进行推理。

又如约束为(check id+10 between 1000 and 10000),SQL where字句提供的表达式则必须为 id+10 op? ? 这种形式才会进行推理。( PostgreSQL 的 between and 会转换为>= and <=,属于B-tree-indexable operators )

又如约束为(check id between 1000 and 10000),SQL where字句提供的表达式则必须为 id op? ? 这种形式才会进行推理。

例子:约束为is [not] null类型

postgres=# create table tt1(id int check (id is null));  
CREATE TABLE  
postgres=# explain select * from tt1 where id=1;  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  
postgres=# explain select * from tt1 where id is null;  
                     QUERY PLAN                        
-----------------------------------------------------  
 Seq Scan on tt1  (cost=0.00..35.50 rows=13 width=4)  
   Filter: (id IS NULL)  
(2 rows)  
postgres=# explain select * from tt1 where id is not null;  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  

约束为 mod(id,4) = 0,=为B-tree-indexable operators

postgres=# create table tt2( id int check(mod(id,4) = 0));  
CREATE TABLE  
postgres=# explain select * from tt2 where id=1;  
                     QUERY PLAN                        
-----------------------------------------------------  
 Seq Scan on tt2  (cost=0.00..41.88 rows=13 width=4)  
   Filter: (id = 1)  
(2 rows)  
-- 要让PG进行逻辑推理,WHERE中必须包含mod(id,4)表达式,并且由于mod是immutable函数,mod(1,4)可以转换为常数,因此以下SQL相当于  
explain select * from tt2 where mod(id,4)=1 and id=1; 这样才可以被逻辑推理。  
postgres=# explain select * from tt2 where mod(id,4)=mod(1,4) and id=1;  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  

约束为 id*100 > 1000,>为B-tree-indexable operators

postgres=# create table tt3( id int check(id*100 > 1000));  
CREATE TABLE  
postgres=# explain select * from tt3 where id=1;  
                     QUERY PLAN                        
-----------------------------------------------------  
 Seq Scan on tt3  (cost=0.00..41.88 rows=13 width=4)  
   Filter: (id = 1)  
(2 rows)  
-- 要让PG进行逻辑推理,WHERE中必须包含id*100表达式,并且*是immutable操作符,所以1*100可以替换为常数。从而进行逻辑推理。  
postgres=# explain select * from tt3 where id=1 and id*100=1*100;  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  

约束为 id+10 between 1000 and 10000,between and 自动转换为>=和and <=。并且WHERE中必须包含id+10表达式,同时>=或<=是B-tree-indexable operators。

postgres=# create table tt4( id int check(id+10 between 1000 and 10000));  
CREATE TABLE  
postgres=# explain select * from tt4 where id=1;  
                     QUERY PLAN                        
-----------------------------------------------------  
 Seq Scan on tt4  (cost=0.00..41.88 rows=13 width=4)  
   Filter: (id = 1)  
(2 rows)  
postgres=# explain select * from tt4 where id=1 and id+10=1+10;  -- +是immutable操作符1+10将转换为11常数。  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  

约束为 check id between 1000 and 10000

postgres=# create table tt5( id int check(id between 1000 and 10000));  
CREATE TABLE  
postgres=# explain select * from tt5 where id=1;  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  
postgres=# explain select * from tt5 where id+1=1;  
                     QUERY PLAN                        
-----------------------------------------------------  
 Seq Scan on tt5  (cost=0.00..48.25 rows=13 width=4)  
   Filter: ((id + 1) = 1)  
(2 rows)  
postgres=# explain select * from tt5 where 1=id;  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  
postgres=# explain select * from tt5 where 1>id;  
                QUERY PLAN                  
------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0)  
   One-Time Filter: false  
(2 rows)  
postgres=# explain select * from tt5 where 1<id;  
                      QUERY PLAN                        
------------------------------------------------------  
 Seq Scan on tt5  (cost=0.00..41.88 rows=850 width=4)  
   Filter: (1 < id)  
(2 rows)  

PostgreSQL数据库是如何实现这些逻辑推理的呢?

上面的例子,都转换成了?1 op ?2,其中 ?1 是一个表达式或字段,?2是一个常数。

但是,数据库是怎么通过一个条件的真伪判断另一个条件的真伪呢?

还是回到一个例子:

check id > 100

推理 id > 1 是真是假?可以通过比较两个常数来决定,100 >= 1 为真则说明 id>1为真。

为什么要比较这两个常数呢?因为这是优化器排除对表的扫描的一种手段,这时还没有到需要用到id值的阶段。所以此时优化器只能通过常数来推理。

具体的代码如下:

目前PG只实现了对btree索引可以用到的操作符的逻辑推理,使用了两张映射表来描述推理关系。

一张表BT_implic_table 用来推理一定为真,另一张表BT_refute_table 用来推理一定为假。

例如:

已知 ATTR given_op CONST1 为真

如果 CONST2 test_op CONST1 为真

则推理得出 ATTR target_op CONST2 一定为真

其中 test_op = BT_implic_table[given_op-1][target_op-1] 就是通过BT_implic_table 映射表取出的操作符。

已知 ATTR given_op CONST1 为真

如果 CONST2 test_op CONST1 为假

则推理得出 ATTR target_op CONST2 一定为假

其中 test_op = BT_refute_table[given_op-1][target_op-1] 就是通过BT_refute_table 映射表取出的操作符。

代码:

/*  
 * Define an "operator implication table" for btree operators ("strategies"),  
 * and a similar table for refutation.  
 *  
 * The strategy numbers defined by btree indexes (see access/skey.h) are:  
 *      (1) <   (2) <=   (3) =   (4) >=   (5) >  
 * and in addition we use (6) to represent <>.  <> is not a btree-indexable  
 * operator, but we assume here that if an equality operator of a btree  
 * opfamily has a negator operator, the negator behaves as <> for the opfamily.  
 * (This convention is also known to get_op_btree_interpretation().)  
 *  
 * The interpretation of:  
 *  
 *      test_op = BT_implic_table[given_op-1][target_op-1]  
 *  
 * where test_op, given_op and target_op are strategy numbers (from 1 to 6)  
 * of btree operators, is as follows:  
 *  
 *   If you know, for some ATTR, that "ATTR given_op CONST1" is true, and you  
 *   want to determine whether "ATTR target_op CONST2" must also be true, then  
 *   you can use "CONST2 test_op CONST1" as a test.  If this test returns true,  
 *   then the target expression must be true; if the test returns false, then  
 *   the target expression may be false.  
 *  
 * For example, if clause is "Quantity > 10" and pred is "Quantity > 5"  
 * then we test "5 <= 10" which evals to true, so clause implies pred.  
 *  
 * Similarly, the interpretation of a BT_refute_table entry is:  
 *  
 *   If you know, for some ATTR, that "ATTR given_op CONST1" is true, and you  
 *   want to determine whether "ATTR target_op CONST2" must be false, then  
 *   you can use "CONST2 test_op CONST1" as a test.  If this test returns true,  
 *   then the target expression must be false; if the test returns false, then  
 *   the target expression may be true.  
 *  
 * For example, if clause is "Quantity > 10" and pred is "Quantity < 5"  
 * then we test "5 <= 10" which evals to true, so clause refutes pred.  
 *  
 * An entry where test_op == 0 means the implication cannot be determined.  
 */  

#define BTLT BTLessStrategyNumber  
#define BTLE BTLessEqualStrategyNumber  
#define BTEQ BTEqualStrategyNumber  
#define BTGE BTGreaterEqualStrategyNumber  
#define BTGT BTGreaterStrategyNumber  
#define BTNE ROWCOMPARE_NE  

static const StrategyNumber BT_implic_table[6][6] = {  
/*  
 *          The target operator:  
 *  
 *   LT    LE    EQ    GE    GT    NE  
 */  
    {BTGE, BTGE, 0, 0, 0, BTGE},    /* LT */  
    {BTGT, BTGE, 0, 0, 0, BTGT},    /* LE */  
    {BTGT, BTGE, BTEQ, BTLE, BTLT, BTNE},       /* EQ */  
    {0, 0, 0, BTLE, BTLT, BTLT},    /* GE */  
    {0, 0, 0, BTLE, BTLE, BTLE},    /* GT */  
    {0, 0, 0, 0, 0, BTEQ}       /* NE */  
};  

static const StrategyNumber BT_refute_table[6][6] = {  
/*  
 *          The target operator:  
 *  
 *   LT    LE    EQ    GE    GT    NE  
 */  
    {0, 0, BTGE, BTGE, BTGE, 0},    /* LT */  
    {0, 0, BTGT, BTGT, BTGE, 0},    /* LE */  
    {BTLE, BTLT, BTNE, BTGT, BTGE, BTEQ},       /* EQ */  
    {BTLE, BTLT, BTLT, 0, 0, 0},    /* GE */  
    {BTLE, BTLE, BTLE, 0, 0, 0},    /* GT */  
    {0, 0, BTEQ, 0, 0, 0}       /* NE */  
};  

这两个表里面的0,表示无法推断真或假的情况。例如通过 a>100 无法推断 a>? 一定为假, 只能推断 a>? 一定为真。通过100, ?, 以及 test_op 来推断,而test_op就是从BT_implic_table表中取出的BTLE即<=,因此判断的依据是 ? <= 100 为真则a>? 一定为真。

PostgreSQL通过get_btree_test_op 获得test_op,代码如下:

get_btree_test_op  
            /*  
             * Look up the "test" strategy number in the implication table  
             */  
            if (refute_it)  
                test_strategy = BT_refute_table[clause_strategy - 1][pred_strategy - 1];  
            else  
                test_strategy = BT_implic_table[clause_strategy - 1][pred_strategy - 1];  

            if (test_strategy == 0)  
            {  
                /* Can't determine implication using this interpretation */  
                continue;  
            }  
            /*  
             * See if opfamily has an operator for the test strategy and the  
             * datatypes.  
             */  
            if (test_strategy == BTNE)  
            {  
                test_op = get_opfamily_member(opfamily_id,  
                                              pred_op_info->oprighttype,  
                                              clause_op_info->oprighttype,  
                                              BTEqualStrategyNumber);  
                if (OidIsValid(test_op))  
                    test_op = get_negator(test_op);  
            }  
            else  
            {  
                test_op = get_opfamily_member(opfamily_id,  
                                              pred_op_info->oprighttype,  
                                              clause_op_info->oprighttype,  
                                              test_strategy);  
            }  

            if (!OidIsValid(test_op))  
                continue;  
...  
    return test_op;  

那么PostgreSQL可以利用这些逻辑推理来做什么呢?

通过推断 "一定为假" 来排除哪些表不需要参与到执行计划。直接排除掉。

通过推断 “一定对真” ,可以用在建立执行计划的过程中。

以一定为假为例,我们看看PostgreSQL优化器如何排除哪些表是不需要参与执行计划的。constraint_exclusion参数控制的逻辑推理应用,可以看到调用栈如下:

relation_excluded_by_constraints 返回 true 表示不需要扫描这个表,返回 false 表示需要扫描这个表。

简单分析一下这个函数的代码:

未开启constraint_exclusion时,不进行逻辑推理。

    /* Skip the test if constraint exclusion is disabled for the rel */  
    if (constraint_exclusion == CONSTRAINT_EXCLUSION_OFF ||  
        (constraint_exclusion == CONSTRAINT_EXCLUSION_PARTITION &&  
         !(rel->reloptkind == RELOPT_OTHER_MEMBER_REL ||  
           (root->hasInheritedTarget &&  
            rel->reloptkind == RELOPT_BASEREL &&  
            rel->relid == root->parse->resultRelation))))  
        return false;  

在检查表自身的约束和SQL提供的where条件前,先检查where 条件是否有自相矛盾的。例如:

 id <> mod(4,3) and id = mod(4,3)  
postgres=# \d+ tt11  
                         Table "public.tt11"  
 Column |  Type   | Modifiers | Storage | Stats target | Description   
--------+---------+-----------+---------+--------------+-------------  
 id     | integer |           | plain   |              |   

postgres=# explain (analyze,verbose) select * from tt11 where id<>mod(4,3) and id=mod(4,3);  
                                     QUERY PLAN                                       
------------------------------------------------------------------------------------  
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)  
   Output: id  
   One-Time Filter: false  
 Planning time: 0.051 ms  
 Execution time: 0.012 ms  
(5 rows)  

代码如下

    /*  
     * Check for self-contradictory restriction clauses.  We dare not make  
     * deductions with non-immutable functions, but any immutable clauses that  
     * are self-contradictory allow us to conclude the scan is unnecessary.  
     *  
     * Note: strip off RestrictInfo because predicate_refuted_by() isn't  
     * expecting to see any in its predicate argument.  
     */  
    safe_restrictions = NIL;  
    foreach(lc, rel->baserestrictinfo)  
    {  
        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);  

        if (!contain_mutable_functions((Node *) rinfo->clause))  
            safe_restrictions = lappend(safe_restrictions, rinfo->clause);  
    }  

    if (predicate_refuted_by(safe_restrictions, safe_restrictions))  
        return true;  

        // 从SQL涉及的表,以及继承表中获取约束  
    /* Only plain relations have constraints */  
    if (rte->rtekind != RTE_RELATION || rte->inh)  
        return false;  

    /*  
     * OK to fetch the constraint expressions.  Include "col IS NOT NULL"  
     * expressions for attnotnull columns, in case we can refute those.  
     */  
    constraint_pred = get_relation_constraints(root, rte->relid, rel, true);  

    /*  
     * We do not currently enforce that CHECK constraints contain only  
     * immutable functions, so it's necessary to check here. We daren't draw  
     * conclusions from plan-time evaluation of non-immutable functions. Since  
     * they're ANDed, we can just ignore any mutable constraints in the list,  
     * and reason about the rest.  
     */  
    safe_constraints = NIL;  
    foreach(lc, constraint_pred)  
    {  
        Node       *pred = (Node *) lfirst(lc);  
                         // 包含非immutable函数的表达式不加入推理判断,因为非immutable函数存在变数,不能转常量  
        if (!contain_mutable_functions(pred))    
            safe_constraints = lappend(safe_constraints, pred);  
    }  

    /*  
     * The constraints are effectively ANDed together, so we can just try to  
     * refute the entire collection at once.  This may allow us to make proofs  
     * that would fail if we took them individually.  
     *  
     * Note: we use rel->baserestrictinfo, not safe_restrictions as might seem  
     * an obvious optimization.  Some of the clauses might be OR clauses that  
     * have volatile and nonvolatile subclauses, and it's OK to make  
     * deductions with the nonvolatile parts.  
     */         
                        //   检测是否一定为假,如果一定为假,则不需要扫描这个表。  
    if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo))  
        return true;  

调用栈如下:

predicate_refuted_by  
predicate_refuted_by_recurse  
predicate_refuted_by_simple_clause  
       return btree_predicate_proof(predicate, clause, true)  
btree_predicate_proof@src/backend/optimizer/util/predtest.c  
    /*  
     * Lookup the comparison operator using the system catalogs and the  
     * operator implication tables.  
     */  
    test_op = get_btree_test_op(pred_op, clause_op, refute_it);  

目前PostgreSQL仅仅支持有限操作符的逻辑推理,这些操作符必须是btree-indexable operator

postgres=# select oprname,oprcode from pg_operator where oid in (select amopopr from pg_amop where amopmethod=(select oid from pg_am where amname='btree'));  
 oprname |         oprcode            
---------+--------------------------  
 =       | int48eq  
 <       | int48lt  
 >       | int48gt  
 <=      | int48le  
 >=      | int48ge  
 <       | boollt  
 >       | boolgt  
 =       | booleq  
 <=      | boolle  
 >=      | boolge  
 =       | chareq  
 =       | nameeq  
 =       | int2eq  
 <       | int2lt  
 =       | int4eq  
 <       | int4lt  
 =       | texteq  
 =       | tideq  
 <       | tidlt  
 >       | tidgt  
 <=      | tidle  
 >=      | tidge  
 =       | int8eq  
 <       | int8lt  
 >       | int8gt  
 <=      | int8le  
 >=      | int8ge  
 =       | int84eq  
 <       | int84lt  
 >       | int84gt  
 <=      | int84le  
 >=      | int84ge  
 >       | int2gt  
 >       | int4gt  
 <=      | int2le  
 <=      | int4le  
 >=      | int2ge  
 >=      | int4ge  
 =       | int24eq  
 =       | int42eq  
 <       | int24lt  
 <       | int42lt  
 >       | int24gt  
 >       | int42gt  
 <=      | int24le  
 <=      | int42le  
 >=      | int24ge  
 >=      | int42ge  
 =       | abstimeeq  
 <       | abstimelt  
 >       | abstimegt  
 <=      | abstimele  
 >=      | abstimege  
 =       | reltimeeq  
 <       | reltimelt  
 >       | reltimegt  
 <=      | reltimele  
 >=      | reltimege  
 =       | oideq  
 <       | oidlt  
 >       | oidgt  
 <=      | oidle  
 >=      | oidge  
 <       | oidvectorlt  
 >       | oidvectorgt  
 <=      | oidvectorle  
 >=      | oidvectorge  
 =       | oidvectoreq  
 =       | float4eq  
 <       | float4lt  
 >       | float4gt  
 <=      | float4le  
 >=      | float4ge  
 <       | charlt  
 <=      | charle  
 >       | chargt  
 >=      | charge  
 <       | namelt  
 <=      | namele  
 >       | namegt  
 >=      | namege  
 <       | text_lt  
 <=      | text_le  
 >       | text_gt  
 >=      | text_ge  
 =       | float8eq  
 <       | float8lt  
 <=      | float8le  
 >       | float8gt  
 >=      | float8ge  
 =       | tintervaleq  
 <       | tintervallt  
 >       | tintervalgt  
 <=      | tintervalle  
 >=      | tintervalge  
 =       | cash_eq  
 <       | cash_lt  
 >       | cash_gt  
 <=      | cash_le  
 >=      | cash_ge  
 =       | bpchareq  
 <       | bpcharlt  
 <=      | bpcharle  
 >       | bpchargt  
 >=      | bpcharge  
 =       | array_eq  
 <       | array_lt  
 >       | array_gt  
 <=      | array_le  
 >=      | array_ge  
 =       | date_eq  
 <       | date_lt  
 <=      | date_le  
 >       | date_gt  
 >=      | date_ge  
 =       | time_eq  
 <       | time_lt  
 <=      | time_le  
 >       | time_gt  
 >=      | time_ge  
 =       | timetz_eq  
 <       | timetz_lt  
 <=      | timetz_le  
 >       | timetz_gt  
 >=      | timetz_ge  
 =       | float48eq  
 <       | float48lt  
 >       | float48gt  
 <=      | float48le  
 >=      | float48ge  
 =       | float84eq  
 <       | float84lt  
 >       | float84gt  
 <=      | float84le  
 >=      | float84ge  
 =       | timestamptz_eq  
 <       | timestamptz_lt  
 <=      | timestamptz_le  
 >       | timestamptz_gt  
 >=      | timestamptz_ge  
 =       | interval_eq  
 <       | interval_lt  
 <=      | interval_le  
 >       | interval_gt  
 >=      | interval_ge  
 =       | macaddr_eq  
 <       | macaddr_lt  
 <=      | macaddr_le  
 >       | macaddr_gt  
 >=      | macaddr_ge  
 =       | network_eq  
 <       | network_lt  
 <=      | network_le  
 >       | network_gt  
 >=      | network_ge  
 =       | numeric_eq  
 <       | numeric_lt  
 <=      | numeric_le  
 >       | numeric_gt  
 >=      | numeric_ge  
 =       | biteq  
 <       | bitlt  
 >       | bitgt  
 <=      | bitle  
 >=      | bitge  
 =       | varbiteq  
 <       | varbitlt  
 >       | varbitgt  
 <=      | varbitle  
 >=      | varbitge  
 =       | int28eq  
 <       | int28lt  
 >       | int28gt  
 <=      | int28le  
 >=      | int28ge  
 =       | int82eq  
 <       | int82lt  
 >       | int82gt  
 <=      | int82le  
 >=      | int82ge  
 =       | byteaeq  
 <       | bytealt  
 <=      | byteale  
 >       | byteagt  
 >=      | byteage  
 =       | timestamp_eq  
 <       | timestamp_lt  
 <=      | timestamp_le  
 >       | timestamp_gt  
 >=      | timestamp_ge  
 ~<~     | text_pattern_lt  
 ~<=~    | text_pattern_le  
 ~>=~    | text_pattern_ge  
 ~>~     | text_pattern_gt  
 ~<~     | bpchar_pattern_lt  
 ~<=~    | bpchar_pattern_le  
 ~>=~    | bpchar_pattern_ge  
 ~>~     | bpchar_pattern_gt  
 <       | date_lt_timestamp  
 <=      | date_le_timestamp  
 =       | date_eq_timestamp  
 >=      | date_ge_timestamp  
 >       | date_gt_timestamp  
 <       | date_lt_timestamptz  
 <=      | date_le_timestamptz  
 =       | date_eq_timestamptz  
 >=      | date_ge_timestamptz  
 >       | date_gt_timestamptz  
 <       | timestamp_lt_date  
 <=      | timestamp_le_date  
 =       | timestamp_eq_date  
 >=      | timestamp_ge_date  
 >       | timestamp_gt_date  
 <       | timestamptz_lt_date  
 <=      | timestamptz_le_date  
 =       | timestamptz_eq_date  
 >=      | timestamptz_ge_date  
 >       | timestamptz_gt_date  
 <       | timestamp_lt_timestamptz  
 <=      | timestamp_le_timestamptz  
 =       | timestamp_eq_timestamptz  
 >=      | timestamp_ge_timestamptz  
 >       | timestamp_gt_timestamptz  
 <       | timestamptz_lt_timestamp  
 <=      | timestamptz_le_timestamp  
 =       | timestamptz_eq_timestamp  
 >=      | timestamptz_ge_timestamp  
 >       | timestamptz_gt_timestamp  
 =       | uuid_eq  
 <       | uuid_lt  
 >       | uuid_gt  
 <=      | uuid_le  
 >=      | uuid_ge  
 =       | pg_lsn_eq  
 <       | pg_lsn_lt  
 >       | pg_lsn_gt  
 <=      | pg_lsn_le  
 >=      | pg_lsn_ge  
 =       | enum_eq  
 <       | enum_lt  
 >       | enum_gt  
 <=      | enum_le  
 >=      | enum_ge  
 <       | tsvector_lt  
 <=      | tsvector_le  
 =       | tsvector_eq  
 >=      | tsvector_ge  
 >       | tsvector_gt  
 <       | tsquery_lt  
 <=      | tsquery_le  
 =       | tsquery_eq  
 >=      | tsquery_ge  
 >       | tsquery_gt  
 =       | record_eq  
 <       | record_lt  
 >       | record_gt  
 <=      | record_le  
 >=      | record_ge  
 *=      | record_image_eq  
 *<      | record_image_lt  
 *>      | record_image_gt  
 *<=     | record_image_le  
 *>=     | record_image_ge  
 =       | range_eq  
 <       | range_lt  
 <=      | range_le  
 >=      | range_ge  
 >       | range_gt  
 =       | jsonb_eq  
 <       | jsonb_lt  
 >       | jsonb_gt  
 <=      | jsonb_le  
 >=      | jsonb_ge  
(273 rows)  

除此以外的操作符,不参与逻辑推理。

例如

我们知道geo严格在坐标10,0的左边,肯定能推理出它不可能在11,0的右边,正常情况下是可以排除对这个表的扫描的。

但是由于<<,>>不是btree operator,所以不参与推理。

postgres=# create table tt13(id int, geo point check(geo << point '(10,0)'));  
CREATE TABLE  
postgres=# explain select * from tt13 where geo >> point '(11,0)';  
                       QUERY PLAN                         
--------------------------------------------------------  
 Seq Scan on tt13  (cost=0.00..31.25 rows=170 width=20)  
   Filter: (geo >> '(11,0)'::point)  
(2 rows)  

这种逻辑推理在分区表的应用中尤为突出,例如:用户规划了一批分区表,按照ID取模分区。

postgres=# create table p(id int, info text);  
CREATE TABLE  
postgres=# create table t0(id int check(abs(mod(id,4))=0), info text);  
CREATE TABLE  
postgres=# create table t1(id int check(abs(mod(id,4))=1), info text);  
CREATE TABLE  
postgres=# create table t2(id int check(abs(mod(id,4))=2), info text);  
CREATE TABLE  
postgres=# create table t3(id int check(abs(mod(id,4))=3), info text);  
CREATE TABLE  
postgres=# alter table t0 inherit p;  
ALTER TABLE  
postgres=# alter table t1 inherit p;  
ALTER TABLE  
postgres=# alter table t2 inherit p;  
ALTER TABLE  
postgres=# alter table t3 inherit p;  
ALTER TABLE  
postgres=# explain select * from p where id=0;  -- id=0 和 abs(mod(id,4)) =  0,1,2,3由于操作数不一致,不会进行推理。  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..103.50 rows=25 width=36)  
   ->  Seq Scan on p  (cost=0.00..0.00 rows=1 width=36)  
         Filter: (id = 0)  
   ->  Seq Scan on t0  (cost=0.00..25.88 rows=6 width=36)  
         Filter: (id = 0)  
   ->  Seq Scan on t1  (cost=0.00..25.88 rows=6 width=36)  
         Filter: (id = 0)  
   ->  Seq Scan on t2  (cost=0.00..25.88 rows=6 width=36)  
         Filter: (id = 0)  
   ->  Seq Scan on t3  (cost=0.00..25.88 rows=6 width=36)  
         Filter: (id = 0)  
(11 rows)  
postgres=# explain select * from p where id=0 and abs(mod(id,4)) = abs(mod(0,4));   -- 所以必须带上与约束一致的操作数  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..35.40 rows=2 width=36)  
   ->  Seq Scan on p  (cost=0.00..0.00 rows=1 width=36)  
         Filter: ((id = 0) AND (abs(mod(id, 4)) = 0))  
   ->  Seq Scan on t0  (cost=0.00..35.40 rows=1 width=36)  
         Filter: ((id = 0) AND (abs(mod(id, 4)) = 0))  
(5 rows)  

如果我们使用的是范围分区,就不存在以上的问题。因为约束中的操作数和WHERE子句中的操作数可以做到一致。

从以上的例子可以了解到,PostgreSQL优化器的逻辑推理能力还可以加强。只要能推理出一定为假的,就可以被优化器用于排除表。例如一些几何类型的操作符,数组类型的操作符等等。

(责任编辑:最模板)
顶一下
(0)
0%
踩一下
(0)
0%
------分隔线----------------------------
栏目列表
热点内容