Introduction
GreenPlum 7.0.0 was released on 2023-09-28, and roughly half a year later GreenPlum 7.1.0 followed on 2024-02-09.
This article gives a brief walkthrough of some of the more practical new features.
Preparing a GreenPlum 7.1.0 Environment
docker rm -f gpdb7
docker run -itd --name gpdb7 -h gpdb7 \
  -p 5437:5432 -p 28087:28080 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/greenplum:7.1.0 \
  /usr/sbin/init

docker exec -it gpdb7 bash
su - gpadmin
gpstart -a
gpcc start

gpcc status
gpstate
This Docker image contains 1 master, 1 standby master, 2 segment, and 2 mirror instances, and also ships GPCC 7.0.0.
New Feature Experiments
VMware Greenplum 7.1.0 introduces the tablefunc module, which provides various example functions that return tables, including rows-to-columns pivoting.
The tablefunc module includes several functions that return tables (that is, multiple rows). These functions are useful both in their own right and as examples of how to write C functions that return multiple rows.
For examples, see: https://www.postgresql.org/docs/12/tablefunc.html
http://postgres.cn/docs/12/tablefunc.html
Function | Returns | Description
---|---|---
normal_rand(int numvals, float8 mean, float8 stddev) | setof float8 | Produces a set of normally distributed random values
crosstab(text sql) | setof record | Produces a "pivot table" containing row names plus N value columns, where N is determined by the row type specified in the calling query
crosstabN(text sql) | setof table_crosstab_N | Produces a "pivot table" containing row names plus N value columns. crosstab2, crosstab3, and crosstab4 are predefined, but you can create additional crosstabN functions as described below
crosstab(text source_sql, text category_sql) | setof record | Produces a "pivot table" with the value columns specified by a second query
crosstab(text sql, int N) | setof record | Obsolete version of crosstab(text). The parameter N is now ignored, since the number of value columns is always determined by the calling query
connectby(text relname, text keyid_fld, text parent_keyid_fld [, text orderby_fld ], text start_with, int max_depth [, text branch_delim ]) | setof record | Produces a representation of a hierarchical tree structure
db1=# CREATE EXTENSION tablefunc;
CREATE EXTENSION
db1=# SELECT * FROM normal_rand(1000, 5, 3);
     normal_rand
----------------------
 2.3210274434791187
 1.231076402857033
 -0.8117263529261152
 -1.2934824713330597
 8.292221876591267
 3.804515144372151
 1.9176029752768766
 7.146218652634886
 3.551605912228543
 5.575493201208664
 6.666709079414525
 2.5228426084040176
 6.407538689302069
 5.8016036456658995
 4.277014091604118
 5.780894470091546
 5.750904724932745
 5.753381245096707
 2.4427467584795792
 6.81576512005292
 8.192744936276732
 6.614708709243898
 8.77794265411034
 5.791113475048419
 5.70369412214234
 4.327753473864319
 7.570550167961118
 3.5597661002608407
 8.046435727461073
 9.658108512543121
 6.470092796527577
 7.666408022086054
db1=#
db1=#
db1=#
db1=# CREATE TABLE ct(id SERIAL, rowid TEXT, attribute TEXT, value TEXT);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
db1=# INSERT INTO ct(rowid, attribute, value) VALUES('test1','att1','val1');
INSERT 0 1
db1=# INSERT INTO ct(rowid, attribute, value) VALUES('test1','att2','val2');
INSERT 0 1
db1=# INSERT INTO ct(rowid, attribute, value) VALUES('test1','att3','val3');
INSERT 0 1
db1=# INSERT INTO ct(rowid, attribute, value) VALUES('test1','att4','val4');
INSERT 0 1
db1=# INSERT INTO ct(rowid, attribute, value) VALUES('test2','att1','val5');
INSERT 0 1
db1=# INSERT INTO ct(rowid, attribute, value) VALUES('test2','att2','val6');
INSERT 0 1
db1=# INSERT INTO ct(rowid, attribute, value) VALUES('test2','att3','val7');
INSERT 0 1
db1=# INSERT INTO ct(rowid, attribute, value) VALUES('test2','att4','val8');
INSERT 0 1
db1=#
db1=# SELECT *
db1-# FROM crosstab(
db1(#   'select rowid, attribute, value
db1'#    from ct
db1'#    where attribute = ''att2'' or attribute = ''att3''
db1'#    order by 1,2')
db1-# AS ct(row_name text, category_1 text, category_2 text, category_3 text);
 row_name | category_1 | category_2 | category_3
----------+------------+------------+------------
 test1 | val2 | val3 |
 test2 | val6 | val7 |
(2 rows)

db1=# create table sales(year int, month int, qty int);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'year' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
db1=# insert into sales values(2007, 1, 1000);
INSERT 0 1
db1=# insert into sales values(2007, 2, 1500);
INSERT 0 1
db1=# insert into sales values(2007, 7, 500);
INSERT 0 1
db1=# insert into sales values(2007, 11, 1500);
INSERT 0 1
db1=# insert into sales values(2007, 12, 2000);
INSERT 0 1
db1=# insert into sales values(2008, 1, 1000);
INSERT 0 1
db1=#
db1=# select * from crosstab(
db1(#   'select year, month, qty from sales order by 1',
db1(#   'select m from generate_series(1,12) m'
db1(# ) as (
db1(#   year int,
db1(#   "Jan" int,
db1(#   "Feb" int,
db1(#   "Mar" int,
db1(#   "Apr" int,
db1(#   "May" int,
db1(#   "Jun" int,
db1(#   "Jul" int,
db1(#   "Aug" int,
db1(#   "Sep" int,
db1(#   "Oct" int,
db1(#   "Nov" int,
db1(#   "Dec" int
db1(# );
 year | Jan  | Feb  | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov  | Dec
------+------+------+-----+-----+-----+-----+-----+-----+-----+-----+------+------
 2007 | 1000 | 1500 |     |     |     |     | 500 |     |     |     | 1500 | 2000
 2008 | 1000 |      |     |     |     |     |     |     |     |     |      |
(2 rows)

db1=# CREATE TABLE cth(rowid text, rowdt timestamp, attribute text, val text);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'rowid' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
db1=# INSERT INTO cth VALUES('test1','01 March 2003','temperature','42');
INSERT 0 1
db1=# INSERT INTO cth VALUES('test1','01 March 2003','test_result','PASS');
INSERT 0 1
db1=# INSERT INTO cth VALUES('test1','01 March 2003','volts','2.6987');
INSERT 0 1
db1=# INSERT INTO cth VALUES('test2','02 March 2003','temperature','53');
INSERT 0 1
db1=# INSERT INTO cth VALUES('test2','02 March 2003','test_result','FAIL');
INSERT 0 1
db1=# INSERT INTO cth VALUES('test2','02 March 2003','test_startdate','01 March 2003');
INSERT 0 1
db1=# INSERT INTO cth VALUES('test2','02 March 2003','volts','3.1234');
INSERT 0 1
db1=#
db1=# SELECT * FROM crosstab
db1-# (
db1(#   'SELECT rowid, rowdt, attribute, val FROM cth ORDER BY 1',
db1(#   'SELECT DISTINCT attribute FROM cth ORDER BY 1'
db1(# )
db1-# AS
db1-# (
db1(#   rowid text,
db1(#   rowdt timestamp,
db1(#   temperature int4,
db1(#   test_result text,
db1(#   test_startdate timestamp,
db1(#   volts float8
db1(# );
 rowid |        rowdt        | temperature | test_result |   test_startdate    | volts
-------+---------------------+-------------+-------------+---------------------+--------
 test1 | 2003-03-01 00:00:00 | 42 | PASS | | 2.6987
 test2 | 2003-03-02 00:00:00 | 53 | FAIL | 2003-03-01 00:00:00 | 3.1234
(2 rows)

db1=#
New pg_buffercache and gp_buffercache Views
VMware Greenplum includes a new extension, pg_buffercache, which gives users access to five views for obtaining cluster-wide shared buffer metrics: gp_buffercache, gp_buffercache_summary, gp_buffercache_usage_counts, gp_buffercache_summary_aggregated, and gp_buffercache_usage_counts_aggregated.
This feature was already available in GreenPlum 6.26.2, though with fewer views. See: https://www.xmmup.com/greenplum-6262banbenxintexingshuoming.html
[gpadmin@gpdb7 ~]$ psql
psql (12.12)
Type "help" for help.

postgres=# select version();
 version
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18), 64-bit compiled on Jan 19 2024 06:39:45 Bhuvnesh C.
(1 row)

postgres=# create database db1;
CREATE DATABASE
postgres=# \c db1
You are now connected to database "db1" as user "gpadmin".
db1=# create extension pg_buffercache;
CREATE EXTENSION
db1=# select count(*) from gp_buffercache;
 count
-------
 12000
(1 row)

db1=# select count(*) from pg_buffercache;
 count
-------
 4000
(1 row)

db1=# select * from gp_buffercache limit 6;
 gp_segment_id | bufferid | relfilenode | reltablespace | reldatabase | relforknumber | relblocknumber | isdirty | usagecount | pinning_backends
---------------+----------+-------------+---------------+-------------+---------------+----------------+---------+------------+------------------
 -1 | 1 | 13721 | 1664 | 0 | 0 | 0 | f | 5 | 0
 -1 | 2 | 1259 | 1663 | 13720 | 0 | 0 | f | 5 | 0
 -1 | 3 | 1259 | 1663 | 13720 | 0 | 1 | f | 5 | 0
 -1 | 4 | 1249 | 1663 | 13720 | 0 | 0 | f | 5 | 0
 -1 | 5 | 1249 | 1663 | 13720 | 0 | 1 | f | 5 | 0
 -1 | 6 | 1249 | 1663 | 13720 | 0 | 2 | f | 5 | 0
(6 rows)

db1=#
db1=# SELECT n.nspname, c.relname, count(*) AS buffers
db1-# FROM pg_buffercache b JOIN pg_class c
db1-# ON b.relfilenode = pg_relation_filenode(c.oid) AND
db1-# b.reldatabase IN (0, (SELECT oid FROM pg_database
db1(# WHERE datname = current_database()))
db1-# JOIN pg_namespace n ON n.oid = c.relnamespace
db1-# GROUP BY n.nspname, c.relname
db1-# ORDER BY 3 DESC
db1-# LIMIT 10;
 nspname | relname | buffers
------------+--------------------------------+---------
 pg_catalog | pg_proc | 14
 pg_catalog | pg_depend_reference_index | 13
 pg_catalog | pg_attribute | 12
 pg_catalog | pg_depend | 11
 pg_catalog | pg_class | 11
 pg_catalog | pg_rewrite | 7
 pg_catalog | pg_type | 7
 pg_catalog | pg_proc_proname_args_nsp_index | 7
 pg_catalog | pg_init_privs | 6
 pg_catalog | pg_authid | 5
(10 rows)

db1=# select count(*) from gp_buffercache_summary;
 count
-------
 3
(1 row)

db1=# select * from gp_buffercache_summary;
 gp_segment_id | buffers_used | buffers_unused | buffers_dirty | buffers_pinned | usagecount_avg
---------------+--------------+----------------+---------------+----------------+--------------------
 -1 | 1562 | 2438 | 120 | 0 | 3.881562099871959
 0 | 1489 | 2511 | 117 | 0 | 3.4976494291470788
 1 | 1493 | 2507 | 119 | 0 | 3.495646349631614
(3 rows)

db1=# select * from gp_buffercache_usage_counts;
 gp_segment_id | usage_count | buffers | dirty | pinned
---------------+-------------+---------+-------+--------
 -1 | 0 | 2438 | 0 | 0
 -1 | 1 | 228 | 5 | 0
 -1 | 2 | 240 | 8 | 0
 -1 | 3 | 49 | 8 | 0
 -1 | 4 | 17 | 1 | 0
 -1 | 5 | 1028 | 98 | 0
 0 | 0 | 2509 | 0 | 0
 0 | 1 | 444 | 6 | 0
 0 | 2 | 123 | 6 | 0
 0 | 3 | 39 | 7 | 0
 0 | 4 | 17 | 2 | 0
 0 | 5 | 868 | 97 | 0
 1 | 0 | 2505 | 0 | 0
 1 | 1 | 446 | 6 | 0
 1 | 2 | 123 | 6 | 0
 1 | 3 | 39 | 7 | 0
 1 | 4 | 18 | 2 | 0
 1 | 5 | 869 | 100 | 0
(18 rows)

db1=# select * from gp_buffercache_summary_aggregated;
 buffers_used | buffers_unused | buffers_dirty | buffers_pinned | usagecount_avg
--------------+----------------+---------------+----------------+-------------------
 4550 | 7450 | 359 | 0 | 3.625432361146132
(1 row)

db1=# select * from gp_buffercache_usage_counts_aggregated;
 usage_count | buffers | dirty | pinned
-------------+---------+-------+--------
 45 | 12000 | 359 | 0
(1 row)

db1=#
Orphaned Files
The gp_check_orphaned_files view in the gp_toolkit schema contains a new column, filepath, which prints the relative/absolute path of each orphaned file.
VMware Greenplum 7.1.0 adds the gp_move_orphaned_files user-defined function (UDF) to the gp_toolkit administrative schema, which moves orphaned files found by the gp_check_orphaned_files view into a file system location that you specify.
Reference: https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/ref_guide-gp_toolkit.html#moveorphanfiles
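As a quick illustration, the view can be queried directly and its findings handed to the new UDF. This is a hedged sketch only: the target directory is an example, and the exact column list should be checked against the gp_toolkit reference linked above.

```sql
-- Sketch only: list orphaned data files, including the new filepath column.
SELECT * FROM gp_toolkit.gp_check_orphaned_files;

-- Move everything the view reports into an inspection directory
-- (/home/gpadmin/orphaned is an example path; the docs recommend running
-- this only while the cluster is otherwise quiet).
SELECT * FROM gp_toolkit.gp_move_orphaned_files('/home/gpadmin/orphaned');
```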
Partitioned Tables
The gp_toolkit administrative schema now includes several objects to aid in partition maintenance: a new view, gp_partitions, plus new user-defined functions including pg_partition_rank(), pg_partition_range_from(), pg_partition_range_to(), pg_partition_bound_value(), pg_partition_isdefault(), pg_partition_lowest_child(), and pg_partition_highest_child(). See the gp_toolkit administrative schema topic for details.
Reference: https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/ref_guide-gp_toolkit.html
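To give a feel for how these helpers fit together, here is an untested sketch. The table name sales_p and its child partition name are hypothetical, and the exact view columns and function signatures should be confirmed against the gp_toolkit reference linked above.

```sql
-- Sketch only: "sales_p" is a hypothetical range-partitioned table.
-- List its child partitions via the new gp_partitions view:
SELECT *
FROM   gp_toolkit.gp_partitions
WHERE  tablename = 'sales_p';

-- Rank of one (hypothetical) child within its parent's range partitions:
SELECT gp_toolkit.pg_partition_rank('sales_p_1_prt_2'::regclass);
```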
The pg_filedump Utility
Reference: https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/utility_guide-ref-pg_filedump.html?hWord=N4IghgNiBcIA4HMD6AzAlhApgEwK4Fs4QBfIA
VMware Greenplum introduces a new utility, pg_filedump, which lets you read the formatted content of VMware Greenplum data files, including table, index, and control files.
The pg_filedump utility formats VMware Greenplum data files -- including table, index and control files -- into a human-readable format.
To use pg_filedump, you must have:
gpsupport 1.0.3 or higher installed
a search path that includes the gpsupport executable path
NOTE: pg_filedump is currently only supported for Greenplum 7 data files.
[gpadmin@gpdb7 18444]$ pg_filedump 9926

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: 9926
* Options used: None
*******************************************************************

Block 0 ********************************************************
<Header> -----
 Block Offset: 0x00000000 Offsets: Lower 64 (0x0040)
 Block: Size 32768 Version 14 Upper 32752 (0x7ff0)
 LSN: logid 0 recoff 0x046f5240 Special 32752 (0x7ff0)
 Items: 10 Free Space: 32688
 Checksum: 0x0496 Prune XID: 0x00000000 Flags: 0x0000 ()
 Length (including item array): 64

 BTree Meta Data: Magic (0x00053162) Version (4)
 Root: Block (0) Level (0)
 FastRoot: Block (0) Level (0)

<Special Section> -----
 BTree Index Section:
 Flags: 0x0008 (META)
 Blocks: Previous (0) Next (0) Level (0) CycleId (0)


*** End of File Encountered. Last Block Read: 0 ***

[gpadmin@gpdb7 mirror]$ find ./ -name pg_control
./gpseg0/global/pg_control
./gpseg1/global/pg_control

[gpadmin@gpdb7 mirror]$ pg_filedump -c ./gpseg0/global/pg_control

*******************************************************************
* PostgreSQL File/Block Formatted Dump Utility
*
* File: ./gpseg0/global/pg_control
* Options used: -c
*******************************************************************

<pg_control Contents> *********************************************

 CRC: Correct
 pg_control Version: 12010700
 Catalog Version: 302307241
 System Identifier: 7287791898375007577
 State: IN ARCHIVE RECOVERY
 Last Mod Time: Sun Feb 18 11:11:48 2024
 Last Checkpoint Record: Log File (0) Offset (0x0cf18ca8)
 Last Checkpoint Record Redo: Log File (0) Offset (0x0cf18b50)
 |- TimeLineID: 1
 |- Next XID: 0/2060
 |- Next OID: 26549
 |- Next Relfilenode: 25699
 |- Next Multi: 1
 |- Next MultiOff: 0
 |- Time: Sun Feb 18 11:11:48 2024
 Minimum Recovery Point: Log File (0) Offset (0x0cfa18c0)
 Backup Start Record: Log File (0) Offset (0x00000000)
 Backup End Record: Log File (0) Offset (0x00000000)
 End-of-Backup Record Required: no
 Maximum Data Alignment: 8
 Floating-Point Sample: 1234567
 Database Block Size: 32768
 Blocks Per Segment: 32768
 XLOG Block Size: 32768
 XLOG Segment Size: 67108864
 Maximum Identifier Length: 64
 Maximum Index Keys: 32
 TOAST Chunk Size: 8140
Segment Recovery: gprecoverseg
VMware Greenplum now supports differential segment recovery when using an input configuration file (gprecoverseg -i). In addition, you can now prepend I, D, or F to an entry in the recover_config_file passed to gprecoverseg -i to indicate the type of segment recovery. Differential recovery was also made available in GreenPlum 6.25.0.
recoveryType field supports below values:
 I/i for incremental recovery
 D/d for differential recovery
 F/f for full recovery
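A hedged sketch of what such an input file might look like. The hostnames, ports, and data directories are hypothetical, and the exact field layout of recover_config_file should be checked against the gprecoverseg reference before use.

```
# Sketch only: hypothetical hosts, ports, and data directories.
# Prefix a line to choose the recovery type for that segment:
# D = differential, I = incremental, F = full.
D|sdw2|40000|/data/primary/gpseg0 sdw3|40000|/data/primary/gpseg0
I|sdw2|40001|/data/primary/gpseg1
```

The file is then passed to the utility as `gprecoverseg -i recover_config_file`.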
EXPLAIN ANALYZE Enhancements
EXPLAIN ANALYZE now shows buffer usage and I/O timings when the BUFFERS keyword is used.
postgres=# \h explain analyze
Command:     EXPLAIN
Description: show the execution plan of a statement
Syntax:
EXPLAIN [ ( option [, ...] ) ] statement
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement

where option can be one of:

  ANALYZE [ boolean ]
  VERBOSE [ boolean ]
  COSTS [ boolean ]
  SETTINGS [ boolean ]
  BUFFERS [ boolean ]
  TIMING [ boolean ]
  SUMMARY [ boolean ]
  FORMAT { TEXT | XML | JSON | YAML }

URL: https://www.postgresql.org/docs/12/sql-explain.html


postgres=# EXPLAIN (ANALYZE) select * from pg_tables;
                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Hash Left Join (cost=2.25..19.63 rows=91 width=260) (actual time=0.181..0.407 rows=91 loops=1)
   Hash Cond: (c.reltablespace = t.oid)
   Extra Text: Hash chain length 1.0 avg, 1 max, using 2 of 65536 buckets.
   -> Hash Left Join (cost=1.20..17.15 rows=91 width=140) (actual time=0.114..0.280 rows=91 loops=1)
      Hash Cond: (c.relnamespace = n.oid)
      Extra Text: Hash chain length 1.0 avg, 1 max, using 9 of 65536 buckets.
      -> Seq Scan on pg_class c (cost=0.00..14.80 rows=91 width=80) (actual time=0.043..0.125 rows=91 loops=1)
         Filter: (relkind = ANY ('{r,p}'::"char"[]))
         Rows Removed by Filter: 533
      -> Hash (cost=1.09..1.09 rows=9 width=68) (actual time=0.009..0.010 rows=9 loops=1)
         Buckets: 65536 Batches: 1 Memory Usage: 513kB
         -> Seq Scan on pg_namespace n (cost=0.00..1.09 rows=9 width=68) (actual time=0.004..0.005 rows=9 loops=1)
   -> Hash (cost=1.02..1.02 rows=2 width=68) (actual time=0.004..0.004 rows=2 loops=1)
      Buckets: 65536 Batches: 1 Memory Usage: 513kB
      -> Seq Scan on pg_tablespace t (cost=0.00..1.02 rows=2 width=68) (actual time=0.002..0.003 rows=2 loops=1)
 Optimizer: Postgres-based planner
 Planning Time: 0.783 ms
   (slice0) Executor memory: 1131K bytes. Work_mem: 513K bytes max.
 Memory used: 128000kB
 Execution Time: 0.462 ms
(20 rows)



postgres=# EXPLAIN (ANALYZE, BUFFERS) select * from pg_tables;
                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Hash Left Join (cost=2.25..19.63 rows=91 width=260) (actual time=0.438..0.726 rows=91 loops=1)
   Hash Cond: (c.reltablespace = t.oid)
   Extra Text: Hash chain length 1.0 avg, 1 max, using 2 of 65536 buckets.
   Buffers: shared hit=9
   -> Hash Left Join (cost=1.20..17.15 rows=91 width=140) (actual time=0.149..0.341 rows=91 loops=1)
      Hash Cond: (c.relnamespace = n.oid)
      Extra Text: Hash chain length 1.0 avg, 1 max, using 9 of 65536 buckets.
      Buffers: shared hit=8
      -> Seq Scan on pg_class c (cost=0.00..14.80 rows=91 width=80) (actual time=0.060..0.140 rows=91 loops=1)
         Filter: (relkind = ANY ('{r,p}'::"char"[]))
         Rows Removed by Filter: 533
         Buffers: shared hit=7
      -> Hash (cost=1.09..1.09 rows=9 width=68) (actual time=0.012..0.013 rows=9 loops=1)
         Buckets: 65536 Batches: 1 Memory Usage: 513kB
         Buffers: shared hit=1
         -> Seq Scan on pg_namespace n (cost=0.00..1.09 rows=9 width=68) (actual time=0.005..0.006 rows=9 loops=1)
            Buffers: shared hit=1
   -> Hash (cost=1.02..1.02 rows=2 width=68) (actual time=0.006..0.006 rows=2 loops=1)
      Buckets: 65536 Batches: 1 Memory Usage: 513kB
      Buffers: shared hit=1
      -> Seq Scan on pg_tablespace t (cost=0.00..1.02 rows=2 width=68) (actual time=0.003..0.004 rows=2 loops=1)
         Buffers: shared hit=1
 Optimizer: Postgres-based planner
 Planning Time: 0.878 ms
   (slice0) Executor memory: 1131K bytes. Work_mem: 513K bytes max.
 Memory used: 128000kB
 Execution Time: 0.811 ms
(27 rows)
gppkg Enhancements
The gppkg utility option -f now helps remove packages that have incomplete or missing files.
[gpadmin@gpdb7 gppkg]$ gppkg install MetricsCollector-7.0.0_gp_7.0.0-rocky8-x86_64.gppkg
Detecting network topology:    [==============================================================] [OK]
2 coordinators and 4 segment instances are detected on 1 unique host.
Distributing package:          [==============================================================] [OK]
Decoding package:              [==============================================================] [OK]
Verifying package installation:[==============================================================] [OK]
Verifying package integrity:   [==============================================================] [OK]
You are going to install the following packages:
        Install 'MetricsCollector@7.0.0_gp_7.0.0'
Continue? [y/N] y
Allocating disk space:         [================X                                             ] [ERROR]
Cleanup:                       [==============================================================] [OK]
Error: from gpdb7: IoError(file '/usr/local/greenplum-db-7.1.0/lib/postgresql/metrics_collector.so' exists in the filesystem

Caused by:
    entity already exists)


[gpadmin@gpdb7 gppkg]$ ll /usr/local/greenplum-db-7.1.0/lib/postgresql/metrics_collector.so
-rwxr-xr-x 1 gpadmin gpadmin 3570904 Jan 31 14:51 /usr/local/greenplum-db-7.1.0/lib/postgresql/metrics_collector.so


[gpadmin@gpdb7 gppkg]$ gppkg install MetricsCollector-7.0.0_gp_7.0.0-rocky8-x86_64.gppkg -f
Detecting network topology:    [==============================================================] [OK]
2 coordinators and 4 segment instances are detected on 1 unique host.
Distributing package:          [==============================================================] [OK]
Decoding package:              [==============================================================] [OK]
Verifying package installation:[==============================================================] [OK]
Verifying package integrity:   [==============================================================] [OK]
You are going to install the following packages:
        Install 'MetricsCollector@7.0.0_gp_7.0.0'
Continue? [y/N] y
Allocating disk space:         [==============================================================] [OK]
Install 'MetricsCollector':    [==============================================================] [OK]
The stdout from the script of the post-install: ] 0.0
-
==========================================================================
Metrics Collector installation is complete!
==========================================================================


Running post-install hook:     [==============================================================] [OK]
Result:
        MetricsCollector has been successfully installed
Clean Up:                      [==============================================================] [OK]
The gp_stat_progress_dtx_recovery System View
The new system view gp_stat_progress_dtx_recovery displays the progress of the Distributed Transaction (DTX) recovery process, which can be useful for monitoring the status of a coordinator recovery after a crash.
[gpadmin@gpdb7 ~]$ ps -ef|grep post | grep bin
gpadmin   1204     1  0 10:15 ?  00:00:01 /usr/local/greenplum-db-7.1.0/bin/postgres -D /opt/greenplum/data/primary/gpseg0 -c gp_role=execute
gpadmin   1209     1  0 10:15 ?  00:00:01 /usr/local/greenplum-db-7.1.0/bin/postgres -D /opt/greenplum/data/primary/gpseg1 -c gp_role=execute
gpadmin   1243     1  0 10:15 ?  00:00:01 /usr/local/greenplum-db-7.1.0/bin/postgres -D /opt/greenplum/data/master/gpseg-1 -c gp_role=dispatch
gpadmin   1393     1  0 10:15 ?  00:00:00 /usr/local/greenplum-db-7.1.0/bin/postgres -D /opt/greenplum/data/master_standby/gpseg-1 -c gp_role=dispatch
gpadmin   4525     1  0 10:16 ?  00:00:00 /usr/local/greenplum-db-7.1.0/bin/postgres -D /opt/greenplum/data/mirror/gpseg0 -c gp_role=execute
gpadmin   4526     1  0 10:16 ?  00:00:00 /usr/local/greenplum-db-7.1.0/bin/postgres -D /opt/greenplum/data/mirror/gpseg1 -c gp_role=execute
[gpadmin@gpdb7 ~]$ kill -9 1209
[gpadmin@gpdb7 ~]$ psql
psql (12.12)
Type "help" for help.

postgres=# select * from gp_stat_progress_dtx_recovery;
 phase | recover_commited_dtx_total | recover_commited_dtx_completed | in_doubt_tx_total | in_doubt_tx_in_progress | in_doubt_tx_aborted
-------+----------------------------+--------------------------------+-------------------+-------------------------+---------------------
(0 rows)

postgres=# select * from gp_stat_progress_dtx_recovery;
 phase | recover_commited_dtx_total | recover_commited_dtx_completed | in_doubt_tx_total | in_doubt_tx_in_progress | in_doubt_tx_aborted
------------------------------------------+----------------------------+--------------------------------+-------------------+-------------------------+---------------------
 gathering in-doubt orphaned transactions | 0 | 0 | 0 | 0 | 0
(1 row)

postgres=# select * from gp_stat_progress_dtx_recovery;
 phase | recover_commited_dtx_total | recover_commited_dtx_completed | in_doubt_tx_total | in_doubt_tx_in_progress | in_doubt_tx_aborted
-------+----------------------------+--------------------------------+-------------------+-------------------------+---------------------
(0 rows)
postgres=# select * from gp_segment_configuration ;
 dbid | content | role | preferred_role | mode | status | port | hostname | address | datadir
------+---------+------+----------------+------+--------+------+----------+---------+--------------------------------------------
 1 | -1 | p | p | n | u | 5432 | gpdb7 | gpdb7 | /opt/greenplum/data/master/gpseg-1
 3 | 1 | m | p | n | d | 6001 | gpdb7 | gpdb7 | /opt/greenplum/data/primary/gpseg1
 5 | 1 | p | m | n | u | 7001 | gpdb7 | gpdb7 | /opt/greenplum/data/mirror/gpseg1
 6 | -1 | m | m | s | u | 5433 | gpdb7 | gpdb7 | /opt/greenplum/data/master_standby/gpseg-1
 2 | 0 | p | p | s | u | 6000 | gpdb7 | gpdb7 | /opt/greenplum/data/primary/gpseg0
 4 | 0 | m | m | s | u | 7000 | gpdb7 | gpdb7 | /opt/greenplum/data/mirror/gpseg0
(6 rows)

postgres=#
Configuring the Log Location with log_directory
You can now manually configure the location of VMware Greenplum logs with the server configuration parameter log_directory. The gpsupport utility also supports collecting logs from the directory set by this parameter.
-- GPDB 7.1.0: logs default to the log directory, e.g. /opt/greenplum/data/master/gpseg-1/log/
[gpadmin@gpdb7 ~]$ gpconfig -s log_directory
Values on all segments are consistent
GUC              : log_directory
Coordinator value: log
Segment     value: log

-- GPDB 6.26: logs default to the pg_log directory
[gpadmin@gpdb6261 ~]$ gpconfig -s log_directory
Values on all segments are consistent
GUC              : log_directory
Master  value: pg_log
Segment value: pg_log
[gpadmin@gpdb6261 ~]$
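To actually relocate the logs, the parameter can be changed with gpconfig followed by a restart. This is a hedged sketch: /data/gplogs is an example path, and since log_directory takes effect at server start, a restart window is needed.

```shell
# Sketch only: /data/gplogs is an example directory that must exist
# (and be writable by gpadmin) on every host in the cluster.
gpconfig -c log_directory -v '/data/gplogs'   # set the GUC cluster-wide
gpstop -ar                                    # restart so the change takes effect
gpconfig -s log_directory                     # verify the new value
```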
New optimizer_enable_right_outer_join Server Configuration Parameter
The new optimizer_enable_right_outer_join server configuration parameter lets you control whether GPORCA generates right outer joins. If you observe poor performance related to right outer joins, you can choose to suppress them. This feature was already available in GreenPlum 6.26.2; see: https://www.xmmup.com/greenplum-6262banbenxintexingshuoming.html
[gpadmin@gpdb7 ~]$ gpconfig -s optimizer_enable_right_outer_join
Values on all segments are consistent
GUC              : optimizer_enable_right_outer_join
Coordinator value: on
Segment     value: on
[gpadmin@gpdb7 ~]$
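For a quick experiment, the parameter can also be toggled per session rather than cluster-wide, a minimal sketch:

```sql
-- Sketch only: suppress right outer joins for the current session while
-- investigating a suspected regression, then revert.
SET optimizer_enable_right_outer_join = off;
-- ... run the problematic query and compare its EXPLAIN output ...
RESET optimizer_enable_right_outer_join;
```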
VACUUM Now Includes the SKIP_DATABASE_STATS and ONLY_DATABASE_STATS Clauses
postgres=# \h vacuum
Command:     VACUUM
Description: garbage-collect and optionally analyze a database
Syntax:
VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ AO_AUX_ONLY ] [ ANALYZE ] [ table_and_columns [, ...] ]

where option can be one of:

  FULL [ boolean ]
  FREEZE [ boolean ]
  VERBOSE [ boolean ]
  AO_AUX_ONLY [ boolean ]
  ANALYZE [ boolean ]
  DISABLE_PAGE_SKIPPING [ boolean ]
  SKIP_LOCKED [ boolean ]
  INDEX_CLEANUP [ boolean ]
  TRUNCATE [ boolean ]
  SKIP_DATABASE_STATS [ boolean ]
  ONLY_DATABASE_STATS [ boolean ]

and table_and_columns is:

  table_name [ ( column_name [, ...] ) ]

URL: https://www.postgresql.org/docs/12/sql-vacuum.html
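These clauses are most useful when vacuuming many tables in a loop: skip the per-command update of database-wide statistics, then perform it once at the end. A hedged sketch (t1 and t2 are example table names):

```sql
-- Sketch only: skip the database-wide stats update on each per-table pass...
VACUUM (SKIP_DATABASE_STATS) t1;
VACUUM (SKIP_DATABASE_STATS) t2;
-- ...then compute the database-wide statistics exactly once:
VACUUM (ONLY_DATABASE_STATS);
```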
The output of the pg_config command now includes Greenplum version information.
[gpadmin@gpdb7 ~]$ which pg_config
/usr/local/greenplum-db-7.1.0/bin/pg_config
[gpadmin@gpdb7 ~]$ pg_config
BINDIR = /usr/local/greenplum-db-7.1.0/bin
DOCDIR = /usr/local/greenplum-db-7.1.0/share/doc/postgresql
HTMLDIR = /usr/local/greenplum-db-7.1.0/share/doc/postgresql
INCLUDEDIR = /usr/local/greenplum-db-7.1.0/include
PKGINCLUDEDIR = /usr/local/greenplum-db-7.1.0/include/postgresql
INCLUDEDIR-SERVER = /usr/local/greenplum-db-7.1.0/include/postgresql/server
LIBDIR = /usr/local/greenplum-db-7.1.0/lib
PKGLIBDIR = /usr/local/greenplum-db-7.1.0/lib/postgresql
LOCALEDIR = /usr/local/greenplum-db-7.1.0/share/locale
MANDIR = /usr/local/greenplum-db-7.1.0/man
SHAREDIR = /usr/local/greenplum-db-7.1.0/share/postgresql
SYSCONFDIR = /usr/local/greenplum-db-7.1.0/etc/postgresql
PGXS = /usr/local/greenplum-db-7.1.0/lib/postgresql/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-gssapi' '--enable-orafce' '--enable-orca' '--enable-gpcloud' '--with-libxml' '--with-openssl' '--with-pam' '--with-ldap' '--with-uuid=e2fs' '--with-llvm' '--with-pgport=5432' '--disable-debug-extensions' '--disable-tap-tests' '--enable-ic-proxy' '--with-perl' '--with-python' 'PYTHON=python3.9' '--with-includes=/tmp/build/60664f00/gpdb_src/gpAux/ext/rocky8_x86_64/include /tmp/build/60664f00/gpdb_src/gpAux/ext/rocky8_x86_64/include/libxml2' '--with-libraries=/tmp/build/60664f00/gpdb_src/gpAux/ext/rocky8_x86_64/lib' '--disable-rpath' 'LDFLAGS=-Wl,--enable-new-dtags -Wl,-rpath,$ORIGIN/../lib' '--prefix=/usr/local/greenplum-db-devel' '--mandir=/usr/local/greenplum-db-devel/man' 'CFLAGS=-O3 -fargument-noalias-global -fno-omit-frame-pointer -g' 'PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-but-set-variable -Werror=implicit-fallthrough=3 -Wno-format-truncation -Wno-stringop-truncation -O3 -fargument-noalias-global -fno-omit-frame-pointer -g -Werror=uninitialized -Werror=implicit-function-declaration
CFLAGS_SL = -fPIC
LDFLAGS = -Wl,--enable-new-dtags -Wl,-rpath,$ORIGIN/../lib -L/usr/lib64 -Wl,--as-needed
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgcommon -lpgport -lpthread -lxerces-c -lbz2 -lxml2 -lpam -lrt -lssl -lcrypto -lgssapi_krb5 -luv -lz -lreadline -lrt -lcrypt -ldl -lm -lcurl -L/usr/lib -lzstd
VERSION = PostgreSQL 12.12
GP_VERSION = Greenplum 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515
[gpadmin@gpdb7 ~]$
Full Release Notes (Original Text)
Release 7.1.0
Release Date: 2024-02-09
VMware Greenplum 7.1.0 is a minor release that includes new and changed features and resolves several issues.
New and Changed Features
VMware Greenplum 7.1.0 includes these new and changed features:
- The pgvector module was updated to version 0.5.1. Refer to pgvector for module and upgrade information.
- The ip4r module was updated to version 2.4.2. See ip4r.
- VMware Greenplum 7.1.0 introduces the tablefunc module, which provides various examples of functions that return tables.
- VMware Greenplum includes a new extension -- pg_buffercache -- which gives users access to five views to obtain clusterwide shared buffer metrics: gp_buffercache, gp_buffercache_summary, gp_buffercache_usage_counts, gp_buffercache_summary_aggregated, and gp_buffercache_usage_counts_aggregated.
- VMware Greenplum 7.1.0 adds the gp_move_orphaned_files user-defined function (UDF) to the gp_toolkit administrative schema, which moves orphaned files found by the gp_check_orphaned_files view into a file system location that you specify.
- The gp_check_orphaned_files view in the gp_toolkit schema contains a new column -- filepath -- which prints the relative/absolute path of the orphaned file.
- The Greenplum package utility, gppkg, introduces a new option to specify the name of the package to migrate to another minor version of VMware Greenplum, instead of migrating all packages.
- The gp_toolkit administrative schema now includes some objects to aid in partition maintenance: a new view -- gp_partitions -- and several new user-defined functions, including: pg_partition_rank(), pg_partition_range_from(), pg_partition_range_to(), pg_partition_bound_value(), pg_partition_isdefault(), pg_partition_lowest_child(), and pg_partition_highest_child(). See The gp_toolkit Administrative Schema topic for details.
- VMware Greenplum introduces a new utility -- pg_filedump -- which allows you to read formatted content of VMware Greenplum data files, including table, index and control files.
- Query optimization has been fine-tuned to enhance performance for queries containing multiple DQA (Distinct Qualified Aggregate) and standard aggregates. This refinement leads to substantial IO savings, resulting in improved processing speed. This optimization may not be applicable for certain specialized queries, such as scenarios in which there are multiple columns from different DQA sources within a standard aggregate, or when filters are present within the DQA.
- The new gp_postmaster_address_family server configuration parameter tells a node which type of IP address to use when initializing a cluster.
- Greenplum's Data Science Package for Python now includes the catboost library, a high-performance open source library for gradient boosting on decision trees.
- VMware Greenplum now supports differential segment recovery when using input configuration files (gprecoverseg -i). In addition, you may now prepend an I, D, or F to an entry in the recover_config_file you pass to gprecoverseg -i to indicate the type of segment recovery.
- EXPLAIN ANALYZE now shows buffer usage and I/O timings when using the BUFFERS keyword.
- The gpstate utility now tracks data synchronization for a differential recovery with the -e option.
- VMware Greenplum now supports the TABLESAMPLE clause for append-optimized tables, in addition to heap tables. Both BERNOULLI and SYSTEM sampling methods are now supported.
- VMware Greenplum now supports the SYSTEM_ROWS and SYSTEM_TIME sampling methods for all tables, made available through the new tsm_system_rows and tsm_system_time modules, respectively.
- The gppkg utility option -f now helps remove packages which have incomplete or missing files.
- The PgBouncer connection pooler 1.21.0 is now distributed with VMware Greenplum 7.1.0, which includes support for encrypted LDAP passwords. Refer to Using the PgBouncer Connection Pooler for more details.
- The new gprecoverseg option max-rate allows you to limit the maximum transfer bandwidth rate for a full segment recovery.
- The gpmovemirrors utility has a new disk space check, so the utility will fail if the target host does not have enough space to accommodate the new mirrors.
- Autovacuum now drops any orphaned temporary tables not dropped by the backends they were created on.
- You may manually configure the location of your VMware Greenplum logs with the server configuration parameter log_directory. The gpsupport utility also supports collecting the logs from the directory set by this server configuration parameter.
- The system view gp_stat_progress_dtx_recovery displays the progress of the Distributed Transaction (DTX) Recovery process, which may be useful to monitor the status of a coordinator recovery after a crash.
- The new gp_autostats_lock_wait server configuration parameter allows you to control whether ANALYZE commands triggered by automatic statistics collection will block if they cannot acquire the table lock.
- The new optimizer_enable_right_outer_join server configuration parameter allows you to control whether GPORCA generates right outer joins. In situations in which you are observing poor performance related to right outer joins you may choose to suppress their use.
- VMware Greenplum 7.1 now supports the VMware Greenplum Virtual Appliance. The virtual machine appliance contains everything you may need for an easy deployment of VMware Greenplum on vSphere. See VMware Greenplum on vSphere for more details.
- The PostgresML extension now includes the pgml.train and pgml.predict functions for supervised learning.
- You may configure one or more hosts outside your Greenplum cluster to use as a remote container host for your PL/Container workload, reducing the computing overload of the Greenplum hosts. See Configuring a Remote PL/Container for more details.
- You can now use resource groups to manage and limit the total CPU resources for a PL/Container runtime. See PL/Container Resource Management for more details.
- You can now download a VMware Greenplum 7 PL/Container image for R from VMware Tanzu Network.
- The VACUUM command now includes the SKIP_DATABASE_STATS and ONLY_DATABASE_STATS clauses.
- The output of the pg_config command now includes the Greenplum version.
References
https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/relnotes-release-notes.html