API

pgdumplib

pgdumplib exposes a load() function that creates a pgdumplib.dump.Dump instance from a pg_dump file created in the custom format (-Fc).

See the examples page for how to read an existing dump or create a new one.

load(filepath, converter=None)

Load a pg_dump file created with -Fc from disk

:param filepath: The path to the dump to load
:type filepath: str or pathlib.Path
:param converter: The data converter class to use (Default: pgdumplib.converters.DataConverter)
:type converter: Converter class or None
:raises: ValueError
:rtype: pgdumplib.dump.Dump

Source code in pgdumplib/__init__.py
def load(
    filepath: str | pathlib.Path,
    converter: typing.Any = None,
) -> 'dump.Dump':
    """Load a pg_dump file created with -Fc from disk

    :param filepath: The path to the dump to load
    :type filepath: str or pathlib.Path
    :param converter: The data converter class to use
        (Default: :py:class:`pgdumplib.converters.DataConverter`)
    :type converter: Converter class or None
    :raises: :py:exc:`ValueError`
    :rtype: pgdumplib.dump.Dump

    """
    from pgdumplib import dump

    return dump.Dump(converter=converter).load(filepath)
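
For illustration, a minimal sketch of loading a dump created with pg_dump -Fc; the path example.dump is a placeholder:

import pgdumplib

dump = pgdumplib.load('example.dump')
print(dump.dbname, dump.timestamp)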

new(dbname='pgdumplib', encoding='UTF8', converter=None, appear_as='18.0')

Create a new pgdumplib.dump.Dump instance

:param dbname: The database name for the dump (Default: pgdumplib)
:param encoding: The data encoding (Default: UTF8)
:param converter: The data converter class to use (Default: pgdumplib.converters.DataConverter)
:type converter: Converter class or None
:param appear_as: The version of Postgres to emulate (Default: 18.0)
:rtype: pgdumplib.dump.Dump

Source code in pgdumplib/__init__.py
def new(
    dbname: str = 'pgdumplib',
    encoding: str = 'UTF8',
    converter: typing.Any = None,
    appear_as: str = '18.0',
) -> 'dump.Dump':
    """Create a new :py:class:`pgdumplib.dump.Dump` instance

    :param dbname: The database name for the dump (Default: ``pgdumplib``)
    :param encoding: The data encoding (Default: ``UTF8``)
    :param converter: The data converter class to use
        (Default: :py:class:`pgdumplib.converters.DataConverter`)
    :type converter: Converter class or None
    :param appear_as: The version of Postgres to emulate
        (Default: ``18.0``)
    :rtype: pgdumplib.dump.Dump

    """
    from pgdumplib import dump

    return dump.Dump(dbname, encoding, converter, appear_as)
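
A short sketch of creating an empty dump and writing it to disk; the database name and path are illustrative:

import pgdumplib

dump = pgdumplib.new(dbname='example', encoding='UTF8')
dump.save('example.dump')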

pgdumplib.dump

The pgdumplib.dump.Dump class exposes methods to load an existing dump, to add an entry to a dump, to add table data to a dump, to add blob data to a dump, and to save a new dump.

Converters are available to format the data returned by pgdumplib.dump.Dump.read_data. The converter is passed in during construction of a new pgdumplib.dump.Dump, and is also available as an argument to pgdumplib.load.

The default converter, pgdumplib.converters.DataConverter, returns all fields as strings, only replacing NULL with None. The pgdumplib.converters.SmartDataConverter will attempt to convert all columns to native Python data types.
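
For example, to receive native Python types instead of strings, the SmartDataConverter can be passed to load. A sketch; example.dump is a placeholder path and public.users is a hypothetical table:

import pgdumplib
from pgdumplib.converters import SmartDataConverter

dump = pgdumplib.load('example.dump', converter=SmartDataConverter)
for row in dump.table_data('public', 'users'):
    print(row)  # columns arrive as ints, dates, etc. where possible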

When loading or creating a dump, the table and blob data are stored in gzip-compressed data files in a temporary directory that is automatically cleaned up when the pgdumplib.dump.Dump instance is released.

TableData

Encapsulates table data in a temporary file, exposing an API for appending data one row at a time.

Do not create this class directly; instead invoke pgdumplib.dump.Dump.table_data_writer.

Source code in pgdumplib/dump.py
class TableData:
    """Used to encapsulate table data using temporary file and allowing
    for an API that allows for the appending of data one row at a time.

    Do not create this class directly, instead invoke
    :py:meth:`~pgdumplib.dump.Dump.table_data_writer`.

    """

    def __init__(self, dump_id: int, tempdir: str, encoding: str):
        self.dump_id = dump_id
        self._encoding = encoding
        self._path = pathlib.Path(tempdir) / f'{dump_id}.gz'
        self._handle = gzip.open(self._path, 'wb')

    def append(self, *args) -> None:
        """Append a row to the table data, passing columns in as args

        Column order must match the order specified when
        :py:meth:`~pgdumplib.dump.Dump.table_data_writer` was invoked.

        All columns will be coerced to a string with special attention
        paid to ``None``, converting it to the null marker (``\\N``) and
        :py:class:`datetime.datetime` objects, which will have the proper
        pg_dump timestamp format applied to them.

        """
        row = '\t'.join([self._convert(c) for c in args])
        self._handle.write(f'{row}\n'.encode(self._encoding))

    def finish(self) -> None:
        """Invoked prior to saving a dump to close the temporary data
        handle and switch the class into read-only mode.

        For use by :py:class:`pgdumplib.dump.Dump` only.

        """
        if not self._handle.closed:
            self._handle.close()
        self._handle = gzip.open(self._path, 'rb')

    def read(self) -> bytes:
        """Read the data from disk for writing to the dump

        For use by :py:class:`pgdumplib.dump.Dump` only.

        """
        self._handle.seek(0)
        return self._handle.read()

    @property
    def size(self) -> int:
        """Return the current size of the data on disk"""
        self._handle.seek(0, io.SEEK_END)  # Seek to end to figure out size
        size = self._handle.tell()
        self._handle.seek(0)
        return size

    @staticmethod
    def _convert(column: typing.Any) -> str:
        """Convert the column to a string

        :param column: The column to convert

        """
        if isinstance(column, datetime.datetime):
            return column.strftime(constants.PGDUMP_STRFTIME_FMT)
        elif column is None:
            return '\\N'
        return str(column)

size property

Return the current size of the data on disk

append(*args)

Append a row to the table data, passing columns in as args

Column order must match the order specified when pgdumplib.dump.Dump.table_data_writer was invoked.

All columns will be coerced to strings, with special attention paid to None, which is converted to the null marker (\N), and datetime.datetime objects, which have the proper pg_dump timestamp format applied to them.

Source code in pgdumplib/dump.py
def append(self, *args) -> None:
    """Append a row to the table data, passing columns in as args

    Column order must match the order specified when
    :py:meth:`~pgdumplib.dump.Dump.table_data_writer` was invoked.

    All columns will be coerced to a string with special attention
    paid to ``None``, converting it to the null marker (``\\N``) and
    :py:class:`datetime.datetime` objects, which will have the proper
    pg_dump timestamp format applied to them.

    """
    row = '\t'.join([self._convert(c) for c in args])
    self._handle.write(f'{row}\n'.encode(self._encoding))
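
A sketch of how rows are appended in practice via table_data_writer; the table definition and values are illustrative, and constants.TABLE is assumed to name the table entry type:

import pgdumplib
from pgdumplib import constants

dump = pgdumplib.new('example')
table = dump.add_entry(
    desc=constants.TABLE,
    namespace='public',
    tag='users',
    defn='CREATE TABLE public.users (id integer, name text);')
with dump.table_data_writer(table, ['id', 'name']) as writer:
    writer.append(1, 'alice')
    writer.append(2, None)  # None is written as the \N null marker
dump.save('example.dump')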

finish()

Invoked prior to saving a dump to close the temporary data handle and switch the class into read-only mode.

For use by pgdumplib.dump.Dump only.

Source code in pgdumplib/dump.py
def finish(self) -> None:
    """Invoked prior to saving a dump to close the temporary data
    handle and switch the class into read-only mode.

    For use by :py:class:`pgdumplib.dump.Dump` only.

    """
    if not self._handle.closed:
        self._handle.close()
    self._handle = gzip.open(self._path, 'rb')

read()

Read the data from disk for writing to the dump

For use by pgdumplib.dump.Dump only.

Source code in pgdumplib/dump.py
def read(self) -> bytes:
    """Read the data from disk for writing to the dump

    For use by :py:class:`pgdumplib.dump.Dump` only.

    """
    self._handle.seek(0)
    return self._handle.read()

Dump

Create a new instance of the pgdumplib.dump.Dump class

Once created, the instance of pgdumplib.dump.Dump can be used to read existing dumps or to create new ones.

:param str dbname: The database name for the dump (Default: pgdumplib)
:param str encoding: The data encoding (Default: UTF8)
:param converter: The data converter class to use (Default: pgdumplib.converters.DataConverter)

Source code in pgdumplib/dump.py
class Dump:
    """Create a new instance of the :py:class:`~pgdumplib.dump.Dump` class

    Once created, the instance of :py:class:`~pgdumplib.dump.Dump` can
    be used to read existing dumps or to create new ones.

    :param str dbname: The database name for the dump (Default: ``pgdumplib``)
    :param str encoding: The data encoding (Default: ``UTF8``)
    :param converter: The data converter class to use
        (Default: :py:class:`pgdumplib.converters.DataConverter`)

    """

    def __init__(
        self,
        dbname: str = 'pgdumplib',
        encoding: str = 'UTF8',
        converter: typing.Any = None,
        appear_as: str = '12.0',
    ):
        self.compression_algorithm = constants.COMPRESSION_NONE
        self.dbname = dbname
        self.dump_version = VERSION_INFO.format(appear_as, version)
        self.encoding = encoding
        self.entries = [
            models.Entry(
                dump_id=1,
                tag=constants.ENCODING,
                desc=constants.ENCODING,
                defn=f"SET client_encoding = '{self.encoding}';\n",
            ),
            models.Entry(
                dump_id=2,
                tag='STDSTRINGS',
                desc='STDSTRINGS',
                defn="SET standard_conforming_strings = 'on';\n",
            ),
            models.Entry(
                dump_id=3,
                tag='SEARCHPATH',
                desc='SEARCHPATH',
                defn='SELECT pg_catalog.set_config('
                "'search_path', '', false);\n",
            ),
        ]
        self.server_version = self.dump_version
        self.timestamp = datetime.datetime.now(tz=datetime.UTC)

        converter = converter or converters.DataConverter
        self._converter: converters.DataConverter = converter()
        self._format: str = 'Custom'
        self._handle: io.BufferedReader | io.BufferedWriter | None = None
        self._intsize: int = 4
        self._offsize: int = 8
        self._temp_dir = tempfile.TemporaryDirectory()
        parts = tuple(int(v) for v in appear_as.split('.'))
        if len(parts) < 2:
            raise ValueError(f'Invalid appear_as version: {appear_as}')
        k_version = self._get_k_version(parts)
        self._vmaj: int = k_version[0]
        self._vmin: int = k_version[1]
        self._vrev: int = k_version[2]
        self._writers: dict[int, TableData] = {}

    def __repr__(self) -> str:
        return (
            f'<Dump format={self._format!r} '
            f'timestamp={self.timestamp.isoformat()!r} '
            f'entry_count={len(self.entries)!r}>'
        )

    def add_entry(
        self,
        desc: str,
        namespace: str | None = None,
        tag: str | None = None,
        owner: str | None = None,
        defn: str | None = None,
        drop_stmt: str | None = None,
        copy_stmt: str | None = None,
        dependencies: list[int] | None = None,
        tablespace: str | None = None,
        tableam: str | None = None,
        dump_id: int | None = None,
    ) -> models.Entry:
        """Add an entry to the dump

        A :py:exc:`ValueError` will be raised if `desc` is not a value
        that is known in :py:mod:`pgdumplib.constants`.

        The section is determined automatically from the ``desc``.

        When adding data, use :py:meth:`~Dump.table_data_writer` instead of
        invoking :py:meth:`~Dump.add_entry` directly.

        If ``dependencies`` are specified, they will be validated; if no
        entry is found for a given dependency ``dump_id``, a
        :py:exc:`ValueError` will be raised.

        Other omitted values will be set to the defaults specified in the
        :py:class:`pgdumplib.dump.Entry` class.

        The ``dump_id`` will be auto-calculated based upon the existing entries
        if it is not specified.

        .. note:: The creation of ad-hoc blobs is not supported.

        :param str desc: The entry description
        :param str namespace: The namespace of the entry
        :param str tag: The name/table/relation/etc of the entry
        :param str owner: The owner of the object in Postgres
        :param str defn: The DDL definition for the entry
        :param drop_stmt: A drop statement used to drop the entry before
        :param copy_stmt: A copy statement used when there is a corresponding
            data section.
        :param list dependencies: A list of dump_ids of objects that the entry
            is dependent upon.
        :param str tablespace: The tablespace to use
        :param str tableam: The table access method
        :param int dump_id: The dump id, will be auto-calculated if left empty
        :raises: :py:exc:`ValueError`
        :rtype: pgdumplib.dump.Entry

        """
        if desc not in constants.SECTION_MAPPING:
            raise ValueError(f'Invalid desc: {desc}')

        if dump_id is not None and dump_id < 1:
            raise ValueError('dump_id must be greater than 0')

        dump_ids = [e.dump_id for e in self.entries]

        if dump_id and dump_id in dump_ids:
            raise ValueError(f'dump_id {dump_id!r} is already assigned')

        for dependency in dependencies or []:
            if dependency not in dump_ids:
                raise ValueError(
                    f'Dependency dump_id {dependency!r} not found'
                )
        self.entries.append(
            models.Entry(
                dump_id=dump_id or self._next_dump_id(),
                had_dumper=False,
                table_oid='',
                oid='',
                tag=tag or '',
                desc=desc,
                defn=defn or '',
                drop_stmt=drop_stmt or '',
                copy_stmt=copy_stmt or '',
                namespace=namespace or '',
                tablespace=tablespace or None,
                tableam=tableam or None,
                relkind=None,
                owner=owner or '',
                with_oids=False,
                dependencies=dependencies or [],
            )
        )
        return self.entries[-1]

    def blobs(self) -> typing.Generator[tuple[int, bytes], None, None]:
        """Iterator that returns each blob in the dump

        :rtype: tuple(int, bytes)

        """

        def read_oid(fd: io.BufferedReader) -> int | None:
            """Small helper function to deduplicate code"""
            try:
                return struct.unpack('I', fd.read(4))[0]
            except struct.error:
                return None

        for entry in self._data_entries:
            if entry.desc == constants.BLOBS:
                with self._tempfile(entry.dump_id, 'rb') as handle:
                    oid: int | None = read_oid(handle)
                    while oid:
                        length: int = struct.unpack('I', handle.read(4))[0]
                        yield oid, handle.read(length)
                        oid = read_oid(handle)

    def get_entry(self, dump_id: int) -> models.Entry | None:
        """Return the entry for the given `dump_id`

        :param int dump_id: The dump ID of the entry to return.

        """
        for entry in self.entries:
            if entry.dump_id == dump_id:
                return entry
        return None

    def load(self, path: str | os.PathLike) -> typing.Self:
        """Load the Dumpfile, including extracting all data into a temporary
        directory

        :param os.PathLike path: The path of the dump to load
        :raises: :py:exc:`RuntimeError`
        :raises: :py:exc:`ValueError`

        """
        if not pathlib.Path(path).exists():
            raise ValueError(f'Path {path!r} does not exist')

        LOGGER.debug('Loading dump file from %s', path)

        self.entries = []  # Wipe out pre-existing entries
        self._handle = open(path, 'rb')
        self._read_header()
        if not constants.MIN_VER <= self.version <= constants.MAX_VER:
            raise ValueError(
                'Unsupported backup version: {}.{}.{}'.format(*self.version)
            )

        if self.version >= (1, 15, 0):
            self.compression_algorithm = constants.COMPRESSION_ALGORITHMS[
                self._compression_algorithm
            ]

            if (
                self.compression_algorithm
                not in constants.SUPPORTED_COMPRESSION_ALGORITHMS
            ):
                raise ValueError(
                    'Unsupported compression algorithm: '
                    f'{self.compression_algorithm}'
                )
        else:
            self.compression_algorithm = (
                constants.COMPRESSION_GZIP
                if self._read_int() != 0
                else constants.COMPRESSION_NONE
            )

        self.timestamp = self._read_timestamp()
        self.dbname = self._read_bytes().decode(self.encoding)
        self.server_version = self._read_bytes().decode(self.encoding)
        self.dump_version = self._read_bytes().decode(self.encoding)

        self._read_entries()
        self._set_encoding()

        # Cache table data and blobs
        _last_pos = self._handle.tell()

        for entry in self._data_entries:
            if entry.data_state == constants.K_OFFSET_NO_DATA:
                continue
            elif entry.data_state != constants.K_OFFSET_POS_SET:
                raise RuntimeError('Unsupported data format')
            self._handle.seek(entry.offset, io.SEEK_SET)
            block_type, dump_id = self._read_block_header()
            if not dump_id or dump_id != entry.dump_id:
                raise RuntimeError(
                    f'Dump IDs do not match ({dump_id} != {entry.dump_id})'
                )
            if block_type == constants.BLK_DATA:
                self._cache_table_data(dump_id)
            elif block_type == constants.BLK_BLOBS:
                self._cache_blobs(dump_id)
            else:
                raise RuntimeError(f'Unknown block type: {block_type!r}')
        return self

    def lookup_entry(
        self, desc: str, namespace: str, tag: str
    ) -> models.Entry | None:
        """Return the entry for the given namespace and tag

        :param str desc: The desc / object type of the entry
        :param str namespace: The namespace of the entry
        :param str tag: The tag/relation/table name
        :raises: :py:exc:`ValueError`
        :rtype: pgdumplib.dump.Entry or None

        """
        if desc not in constants.SECTION_MAPPING:
            raise ValueError(f'Invalid desc: {desc}')
        for entry in [e for e in self.entries if e.desc == desc]:
            if entry.namespace == namespace and entry.tag == tag:
                return entry
        return None

    def save(self, path: str | os.PathLike) -> None:
        """Save the Dump file to the specified path

        :param path: The path to save the dump to
        :type path: str or os.PathLike

        """
        if self._handle is not None and not self._handle.closed:
            self._handle.close()
        self.compression_algorithm = constants.COMPRESSION_NONE
        self._handle = open(path, 'wb')
        self._save()
        self._handle.close()

    def table_data(
        self, namespace: str, table: str
    ) -> typing.Generator[str | tuple[typing.Any, ...], None, None]:
        """Iterator that returns data for the given namespace and table

        :param str namespace: The namespace/schema for the table
        :param str table: The table name
        :raises: :py:exc:`pgdumplib.exceptions.EntityNotFoundError`

        """
        for entry in self._data_entries:
            if entry.namespace == namespace and entry.tag == table:
                for row in self._read_table_data(entry.dump_id):
                    yield self._converter.convert(row)
                return
        raise exceptions.EntityNotFoundError(namespace=namespace, table=table)

    @contextlib.contextmanager
    def table_data_writer(
        self, entry: models.Entry, columns: abc.Sequence
    ) -> typing.Generator[TableData, None, None]:
        """A context manager that is used to return a
        :py:class:`~pgdumplib.dump.TableData` instance, which can be used
        to add table data to the dump.

        When invoked for a given entry containing the table definition, a
        corresponding table data entry and writer are created if they do
        not already exist.
        :param Entry entry: The entry for the table to add data for
        :param columns: The ordered list of table columns
        :type columns: list or tuple
        :rtype: TableData

        """
        if entry.dump_id not in self._writers.keys():
            dump_id = self._next_dump_id()
            self.entries.append(
                models.Entry(
                    dump_id=dump_id,
                    had_dumper=True,
                    tag=entry.tag,
                    desc=constants.TABLE_DATA,
                    copy_stmt='COPY {}.{} ({}) FROM stdin;'.format(
                        entry.namespace, entry.tag, ', '.join(columns)
                    ),
                    namespace=entry.namespace,
                    owner=entry.owner,
                    dependencies=[entry.dump_id],
                    data_state=constants.K_OFFSET_POS_NOT_SET,
                )
            )
            self._writers[entry.dump_id] = TableData(
                dump_id, self._temp_dir.name, self.encoding
            )
        yield self._writers[entry.dump_id]
        return None

    @property
    def version(self) -> tuple[int, int, int]:
        """Return the version as a tuple to make version comparisons easier.

        :rtype: tuple

        """
        return self._vmaj, self._vmin, self._vrev

    def _cache_blobs(self, dump_id: int) -> None:
        """Create a temp cache file for blob data

        :param int dump_id: The dump ID for the filename

        """
        count = 0
        with self._tempfile(dump_id, 'wb') as handle:
            for oid, blob in self._read_blobs():
                handle.write(struct.pack('I', oid))
                handle.write(struct.pack('I', len(blob)))
                handle.write(blob)
                count += 1

    def _cache_table_data(self, dump_id: int) -> None:
        """Create a temp cache file for the table data

        :param int dump_id: The dump ID for the filename

        """
        with self._tempfile(dump_id, 'wb') as handle:
            handle.write(self._read_data())

    @property
    def _data_entries(self) -> list[models.Entry]:
        """Return the list of entries that are in the data section

        :rtype: list

        """
        return [e for e in self.entries if e.section == constants.SECTION_DATA]

    @staticmethod
    def _get_k_version(appear_as: tuple[int, ...]) -> tuple[int, int, int]:
        for (min_ver, max_ver), value in constants.K_VERSION_MAP.items():
            if min_ver <= appear_as <= max_ver:
                return value
        raise RuntimeError(f'Unsupported PostgreSQL version: {appear_as}')

    def _next_dump_id(self) -> int:
        """Get the next ``dump_id`` that is available for adding an entry

        :rtype: int

        """
        return max(e.dump_id for e in self.entries) + 1

    def _read_blobs(self) -> typing.Generator[tuple[int, bytes], None, None]:
        """Read blobs, returning a tuple of the blob ID and the blob data

        :rtype: (int, bytes)
        :raises: :exc:`RuntimeError`

        """
        oid = self._read_int()
        while oid is not None and oid > 0:
            data = self._read_data()
            yield oid, data
            oid = self._read_int()
            if oid == 0:
                oid = self._read_int()

    def _read_block_header(self) -> tuple[bytes, int | None]:
        """Read the block header in

        :rtype: bytes, int

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        return self._handle.read(1), self._read_int()

    def _read_byte(self) -> int | None:
        """Read in an individual byte

        :rtype: int

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        try:
            return struct.unpack('B', self._handle.read(1))[0]
        except struct.error:
            return None

    def _read_bytes(self) -> bytes:
        """Read in a byte stream

        :rtype: bytes

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        length = self._read_int()
        if length and length > 0:
            value = self._handle.read(length)
            return value
        return b''

    def _read_data(self) -> bytes:
        """Read a data block, returning the bytes.

        :rtype: bytes

        """
        if self.compression_algorithm != constants.COMPRESSION_NONE:
            return self._read_data_compressed()
        return self._read_data_uncompressed()

    def _read_data_compressed(self) -> bytes:
        """Read a compressed data block

        :rtype: bytes

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        buffer = io.BytesIO()
        chunk = b''
        decompress = zlib.decompressobj()
        while True:
            chunk_size = self._read_int()
            if not chunk_size:  # pragma: nocover
                break
            chunk += self._handle.read(chunk_size)
            buffer.write(decompress.decompress(chunk))
            chunk = decompress.unconsumed_tail
            if chunk_size < constants.ZLIB_IN_SIZE:
                break
        return buffer.getvalue()

    def _read_data_uncompressed(self) -> bytes:
        """Read an uncompressed data block

        :rtype: bytes

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        buffer = io.BytesIO()
        while True:
            block_length = self._read_int()
            if not block_length or block_length <= 0:
                break
            buffer.write(self._handle.read(block_length))
        return buffer.getvalue()

    def _read_dependencies(self) -> list[int]:
        """Read in the dependencies for an entry.

        :rtype: list

        """
        values = set({})
        while True:
            value = self._read_bytes()
            if not value:
                break
            values.add(int(value))
        return sorted(values)

    def _read_entries(self) -> None:
        """Read in all of the entries"""
        for _i in range(0, self._read_int() or 0):
            self._read_entry()

    def _read_entry(self) -> None:
        """Read in an individual entry and append it to the entries stack"""
        dump_id = self._read_int()
        if dump_id is None:
            raise ValueError('dump_id cannot be None')
        had_dumper = bool(self._read_int())
        table_oid = self._read_bytes().decode(self.encoding)
        oid = self._read_bytes().decode(self.encoding)
        tag = self._read_bytes().decode(self.encoding)
        desc = self._read_bytes().decode(self.encoding)
        self._read_int()  # Section is mapped, no need to assign
        defn = self._read_bytes().decode(self.encoding)
        drop_stmt = self._read_bytes().decode(self.encoding)
        copy_stmt = self._read_bytes().decode(self.encoding)
        namespace = self._read_bytes().decode(self.encoding)
        tablespace = self._read_bytes().decode(self.encoding)
        # Normalize empty strings to None for consistency
        tablespace = tablespace if tablespace else None
        if self.version >= (1, 14, 0):
            tableam = self._read_bytes().decode(self.encoding)
            # Normalize empty strings to None to prevent invalid SQL
            # generation (e.g., SET default_table_access_method = "";)
            tableam = tableam if tableam else None
        else:
            tableam = None
        if self.version >= (1, 16, 0):
            relkind_val = self._read_int()
            relkind = chr(relkind_val) if relkind_val else None
        else:
            relkind = None
        owner = self._read_bytes().decode(self.encoding)
        with_oids = self._read_bytes() == b'true'
        dependencies = self._read_dependencies()
        data_state, offset = self._read_offset()
        self.entries.append(
            models.Entry(
                dump_id=dump_id,
                had_dumper=had_dumper,
                table_oid=table_oid,
                oid=oid,
                tag=tag,
                desc=desc,
                defn=defn,
                drop_stmt=drop_stmt,
                copy_stmt=copy_stmt,
                namespace=namespace,
                tablespace=tablespace,
                tableam=tableam,
                relkind=relkind,
                owner=owner,
                with_oids=with_oids,
                dependencies=dependencies,
                data_state=data_state or 0,
                offset=offset or 0,
            )
        )

    def _read_header(self) -> None:
        """Read in the dump header

        :raises: ValueError

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        magic_bytes = self._handle.read(5)
        if magic_bytes != constants.MAGIC:
            # Provide helpful error messages based on file content
            error_msg = (
                'Invalid archive header. '
                'pgdumplib only supports custom format dumps '
                'created with pg_dump -Fc'
            )
            try:
                # Try to detect plain SQL files
                file_start = magic_bytes.decode('ascii', errors='ignore')
                if file_start.startswith(('--', '/*', 'SE', 'CR', 'IN', 'DR')):
                    error_msg = (
                        'This appears to be a plain SQL text file. '
                        'pgdumplib only supports custom format dumps '
                        'created with pg_dump -Fc'
                    )
                elif len(file_start) == 0 or not file_start.isprintable():
                    error_msg = (
                        'Invalid archive format. '
                        'pgdumplib only supports custom format dumps '
                        'created with pg_dump -Fc'
                    )
            except (UnicodeDecodeError, AttributeError):
                # Ignore errors from decode or isprintable on invalid data
                pass
            raise ValueError(error_msg)
        self._vmaj = struct.unpack('B', self._handle.read(1))[0]
        self._vmin = struct.unpack('B', self._handle.read(1))[0]
        self._vrev = struct.unpack('B', self._handle.read(1))[0]
        self._intsize = struct.unpack('B', self._handle.read(1))[0]
        self._offsize = struct.unpack('B', self._handle.read(1))[0]
        self._format = constants.FORMATS[
            struct.unpack('B', self._handle.read(1))[0]
        ]
        LOGGER.debug(
            'Archive version %i.%i.%i', self._vmaj, self._vmin, self._vrev
        )
        # v1.15+ has compression_spec.algorithm byte
        if (self._vmaj, self._vmin, self._vrev) >= (1, 15, 0):
            self._compression_algorithm = struct.unpack(
                'B', self._handle.read(1)
            )[0]

    def _read_int(self) -> int | None:
        """Read in a signed integer

        :rtype: int or None

        """
        sign = self._read_byte()
        if sign is None:
            return None
        bs, bv, value = 0, 0, 0
        for _offset in range(0, self._intsize):
            bv = (self._read_byte() or 0) & 0xFF
            if bv != 0:
                value += bv << bs
            bs += 8
        return -value if sign else value

    def _read_offset(self) -> tuple[int, int]:
        """Read in the value for the length of the data stored in the file

        :rtype: int, int

        """
        data_state = self._read_byte() or 0
        value = 0
        for offset in range(0, self._offsize):
            bv = self._read_byte() or 0
            value |= bv << (offset * 8)
        return data_state, value

    def _read_table_data(
        self, dump_id: int
    ) -> typing.Generator[str, None, None]:
        """Iterate through the data returning on row at a time

        :rtype: str

        """
        try:
            with self._tempfile(dump_id, 'rb') as handle:
                for line in handle:
                    out = (line or b'').decode(self.encoding).strip()
                    if out.startswith('\\.') or not out:
                        break
                    yield out
        except exceptions.NoDataError:
            pass

    def _read_timestamp(self) -> datetime.datetime:
        """Read in the timestamp from handle.

        :rtype: datetime.datetime

        """
        second, minute, hour, day, month, year = (
            self._read_int() or 0,
            self._read_int() or 0,
            self._read_int() or 0,
            self._read_int() or 0,
            (self._read_int() or 0) + 1,
            (self._read_int() or 0) + 1900,
        )
        self._read_int()  # DST flag
        return datetime.datetime(
            year, month, day, hour, minute, second, 0, tzinfo=datetime.UTC
        )

    def _save(self) -> None:
        """Save the dump file to disk"""
        self._write_toc()
        self._write_entries()
        if self._write_data():
            self._write_toc()  # Overwrite ToC and entries
            self._write_entries()

    def _set_encoding(self) -> None:
        """If the encoding is found in the dump entries, set the encoding
        to `self.encoding`.

        """
        for entry in self.entries:
            if entry.desc == constants.ENCODING and entry.defn:
                match = ENCODING_PATTERN.match(entry.defn)
                if match:
                    self.encoding = match.group(1)
                    return

    @contextlib.contextmanager
    def _tempfile(
        self, dump_id: int, mode: str
    ) -> typing.Generator[typing.Any, None, None]:
        """Open the temp file for the specified dump_id in the specified mode

        :param int dump_id: The dump_id for the temp file
        :param str mode: The mode (rb, wb)

        """
        path = pathlib.Path(self._temp_dir.name) / f'{dump_id}.gz'
        if not path.exists() and mode.startswith('r'):
            raise exceptions.NoDataError()
        with gzip.open(path, mode) as handle:
            yield handle

    def _write_blobs(self, dump_id: int) -> int:
        """Write the blobs for the entry.

        :param int dump_id: The entry dump ID for the blobs
        :rtype: int

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        length = 0
        with self._tempfile(dump_id, 'rb') as handle:
            self._handle.write(constants.BLK_BLOBS)
            self._write_int(dump_id)
            while True:
                try:
                    oid = struct.unpack('I', handle.read(4))[0]
                except struct.error:
                    break
                length = struct.unpack('I', handle.read(4))[0]
                self._write_int(oid)
                self._write_int(length)
                self._handle.write(handle.read(length))
                self._write_int(0)
            self._write_int(0)
        return length

    def _write_byte(self, value: int) -> None:
        """Write a byte to the handle

        :param int value: The byte value

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        self._handle.write(struct.pack('B', value))

    def _write_data(self) -> set[int]:
        """Write the data blocks, returning a set of IDs that were written"""
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        saved = set({})
        for offset, entry in enumerate(self.entries):
            if entry.section != constants.SECTION_DATA:
                continue
            self.entries[offset].offset = self._handle.tell()
            size = 0
            if entry.desc == constants.TABLE_DATA:
                size = self._write_table_data(entry.dump_id)
                saved.add(entry.dump_id)
            elif entry.desc == constants.BLOBS:
                size = self._write_blobs(entry.dump_id)
                saved.add(entry.dump_id)
            if size:
                self.entries[offset].data_state = constants.K_OFFSET_POS_SET
        return saved

    def _write_entries(self) -> None:
        self._write_int(len(self.entries))
        saved = set({})

        # Always add these entries first
        for entry in self.entries[0:3]:
            self._write_entry(entry)
            saved.add(entry.dump_id)

        saved = self._write_section(
            constants.SECTION_PRE_DATA,
            [
                constants.GROUP,
                constants.ROLE,
                constants.USER,
                constants.SCHEMA,
                constants.EXTENSION,
                constants.AGGREGATE,
                constants.OPERATOR,
                constants.OPERATOR_CLASS,
                constants.CAST,
                constants.COLLATION,
                constants.CONVERSION,
                constants.PROCEDURAL_LANGUAGE,
                constants.FOREIGN_DATA_WRAPPER,
                constants.FOREIGN_SERVER,
                constants.SERVER,
                constants.DOMAIN,
                constants.TYPE,
                constants.SHELL_TYPE,
            ],
            saved,
        )

        saved = self._write_section(constants.SECTION_DATA, [], saved)

        saved = self._write_section(
            constants.SECTION_POST_DATA,
            [
                constants.CHECK_CONSTRAINT,
                constants.CONSTRAINT,
                constants.INDEX,
            ],
            saved,
        )

        saved = self._write_section(constants.SECTION_NONE, [], saved)
        LOGGER.debug('Wrote %i of %i entries', len(saved), len(self.entries))

    def _write_entry(self, entry: models.Entry) -> None:
        """Write the entry

        :param pgdumplib.dump.Entry entry: The entry to write

        """
        LOGGER.debug('Writing %r', entry)
        self._write_int(entry.dump_id)
        self._write_int(int(entry.had_dumper))
        self._write_str(entry.table_oid or '0')
        self._write_str(entry.oid or '0')
        self._write_str(entry.tag)
        self._write_str(entry.desc)
        self._write_int(constants.SECTIONS.index(entry.section) + 1)
        self._write_str(entry.defn)
        self._write_str(entry.drop_stmt)
        self._write_str(entry.copy_stmt)
        self._write_str(entry.namespace)
        self._write_str(entry.tablespace)
        if self.version >= (1, 14, 0):
            LOGGER.debug('Adding tableam')
            self._write_str(entry.tableam)
        if self.version >= (1, 16, 0):
            LOGGER.debug('Adding relkind')
            # Write relkind as an int (character code)
            relkind_val = ord(entry.relkind) if entry.relkind else 0
            self._write_int(relkind_val)
        self._write_str(entry.owner)
        self._write_str('true' if entry.with_oids else 'false')
        for dependency in entry.dependencies or []:
            self._write_str(str(dependency))
        self._write_int(-1)
        self._write_offset(entry.offset, entry.data_state)

    def _write_header(self) -> None:
        """Write the file header"""
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        LOGGER.debug(
            'Writing archive version %i.%i.%i',
            self._vmaj,
            self._vmin,
            self._vrev,
        )
        self._handle.write(constants.MAGIC)
        self._write_byte(self._vmaj)
        self._write_byte(self._vmin)
        self._write_byte(self._vrev)
        self._write_byte(self._intsize)
        self._write_byte(self._offsize)
        self._write_byte(constants.FORMATS.index(self._format))
        # v1.15+ has compression algorithm in header
        if self.version >= (1, 15, 0):
            # Write compression algorithm: 0=none, 1=gzip, 2=lz4, 3=zstd
            comp_alg = constants.COMPRESSION_ALGORITHMS.index(
                self.compression_algorithm
            )
            self._write_byte(comp_alg)

    def _write_int(self, value: int) -> None:
        """Write an integer value

        :param int value:

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        self._write_byte(1 if value < 0 else 0)
        if value < 0:
            value = -value
        for _offset in range(0, self._intsize):
            self._write_byte(value & 0xFF)
            value >>= 8

    def _write_offset(self, value: int, data_state: int) -> None:
        """Write the offset value.

        :param int value: The value to write
        :param int data_state: The data state flag

        """
        self._write_byte(data_state)
        for _offset in range(0, self._offsize):
            self._write_byte(value & 0xFF)
            value >>= 8

    def _write_section(
        self, section: str, obj_types: list[str], saved: set[int]
    ) -> set[int]:
        for obj_type in obj_types:
            for entry in [e for e in self.entries if e.desc == obj_type]:
                self._write_entry(entry)
                saved.add(entry.dump_id)
        for dump_id in toposort.toposort_flatten(
            {
                e.dump_id: set(e.dependencies)
                for e in self.entries
                if e.section == section
            },
            True,
        ):
            if dump_id not in saved:
                found_entry: models.Entry | None = self.get_entry(dump_id)
                if found_entry:
                    self._write_entry(found_entry)
                    saved.add(dump_id)
                else:
                    LOGGER.warning('Entry %d not found, skipping', dump_id)
        return saved

    def _write_str(self, value: str | None) -> None:
        """Write a string or NULL marker

        :param value: The string to write, or None to write -1 length
            (indicating an unset/NULL field in the archive format)

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        if value is None:
            # Write -1 length to indicate "not set" rather than "empty string"
            self._write_int(-1)
        else:
            out = value.encode(self.encoding)
            self._write_int(len(out))
            if out:
                LOGGER.debug('Writing %r', out)
                self._handle.write(out)

    def _write_table_data(self, dump_id: int) -> int:
        """Write the blobs for the entry, returning the # of bytes written

        :param int dump_id: The entry dump ID for the blobs
        :rtype: int

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        self._handle.write(constants.BLK_DATA)
        self._write_int(dump_id)

        writer = [w for w in self._writers.values() if w.dump_id == dump_id]
        if writer:  # Data was added ad-hoc, read from TableData writer
            writer[0].finish()
            # writer.read() returns decompressed data (auto-decompressed)
            data = writer[0].read()

            if self.compression_algorithm != constants.COMPRESSION_NONE:
                # Re-compress with zlib and write in chunks
                # Compress all data as a continuous stream
                compressed_data = zlib.compress(data)

                # Write compressed data in ZLIB_IN_SIZE chunks
                total_size = 0
                offset = 0
                while offset < len(compressed_data):
                    chunk_size = min(
                        constants.ZLIB_IN_SIZE, len(compressed_data) - offset
                    )
                    self._write_int(chunk_size)
                    self._handle.write(
                        compressed_data[offset : offset + chunk_size]
                    )
                    total_size += chunk_size
                    offset += chunk_size
            else:
                # Write uncompressed in chunks
                total_size = 0
                offset = 0
                while offset < len(data):
                    chunk_size = min(
                        constants.ZLIB_IN_SIZE, len(data) - offset
                    )
                    self._write_int(chunk_size)
                    self._handle.write(data[offset : offset + chunk_size])
                    total_size += chunk_size
                    offset += chunk_size
            self._write_int(0)  # End of data indicator
            return total_size

        # Data was cached on load - read from tempfile and write
        with self._tempfile(dump_id, 'rb') as handle:
            # Read all decompressed data from the gzip temp file
            data = handle.read()

        if self.compression_algorithm != constants.COMPRESSION_NONE:
            # Compress and write in chunks
            # Compress all data as a continuous stream
            compressed_data = zlib.compress(data)

            # Write compressed data in ZLIB_IN_SIZE chunks
            total_size = 0
            offset = 0
            while offset < len(compressed_data):
                chunk_size = min(
                    constants.ZLIB_IN_SIZE, len(compressed_data) - offset
                )
                self._write_int(chunk_size)
                self._handle.write(
                    compressed_data[offset : offset + chunk_size]
                )
                total_size += chunk_size
                offset += chunk_size
        else:
            # Write uncompressed in chunks
            total_size = 0
            offset = 0
            while offset < len(data):
                chunk_size = min(constants.ZLIB_IN_SIZE, len(data) - offset)
                self._write_int(chunk_size)
                self._handle.write(data[offset : offset + chunk_size])
                total_size += chunk_size
                offset += chunk_size

        self._write_int(0)  # End of data indicator
        return total_size

    def _write_timestamp(self, value: datetime.datetime) -> None:
        """Write a datetime.datetime value

        :param datetime.datetime value: The value to write

        """
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        self._write_int(value.second)
        self._write_int(value.minute)
        self._write_int(value.hour)
        self._write_int(value.day)
        self._write_int(value.month - 1)
        self._write_int(value.year - 1900)
        self._write_int(1 if value.dst() else 0)

    def _write_toc(self) -> None:
        """Write the ToC for the file"""
        if self._handle is None:
            raise ValueError('File handle is not initialized')
        self._handle.seek(0)
        self._write_header()
        # v1.15+ has compression in header, older versions have it here
        if self.version < (1, 15, 0):
            self._write_int(
                int(self.compression_algorithm != constants.COMPRESSION_NONE)
            )

        self._write_timestamp(self.timestamp)
        self._write_str(self.dbname)
        self._write_str(self.server_version)
        self._write_str(self.dump_version)

version property

Return the version as a tuple to make version comparisons easier.

:rtype: tuple
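
Because the version is a tuple, comparisons read naturally. A sketch, assuming dump is a loaded Dump instance:

if dump.version >= (1, 15, 0):
    # archives v1.15+ carry the compression algorithm in the header
    print(dump.compression_algorithm)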

add_entry(desc, namespace=None, tag=None, owner=None, defn=None, drop_stmt=None, copy_stmt=None, dependencies=None, tablespace=None, tableam=None, dump_id=None)

Add an entry to the dump

A ValueError will be raised if desc is not a value that is known in pgdumplib.constants.

The section is determined automatically from the desc.

When adding data, use Dump.table_data_writer instead of invoking Dump.add_entry directly.

If dependencies are specified, they will be validated; if no entry is found for a given dependency dump_id, a ValueError will be raised.

Other omitted values will be set to the defaults specified in the pgdumplib.dump.Entry class.

The dump_id will be auto-calculated based upon the existing entries if it is not specified.

Note: The creation of ad-hoc blobs is not supported.

:param str desc: The entry description
:param str namespace: The namespace of the entry
:param str tag: The name/table/relation/etc of the entry
:param str owner: The owner of the object in Postgres
:param str defn: The DDL definition for the entry
:param drop_stmt: A drop statement used to drop the entry before creating it
:param copy_stmt: A copy statement used when there is a corresponding data section
:param list dependencies: A list of dump_ids of objects that the entry is dependent upon
:param str tablespace: The tablespace to use
:param str tableam: The table access method
:param int dump_id: The dump id, will be auto-calculated if left empty
:raises: ValueError
:rtype: pgdumplib.dump.Entry

Source code in pgdumplib/dump.py
def add_entry(
    self,
    desc: str,
    namespace: str | None = None,
    tag: str | None = None,
    owner: str | None = None,
    defn: str | None = None,
    drop_stmt: str | None = None,
    copy_stmt: str | None = None,
    dependencies: list[int] | None = None,
    tablespace: str | None = None,
    tableam: str | None = None,
    dump_id: int | None = None,
) -> models.Entry:
    """Add an entry to the dump

    A :py:exc:`ValueError` will be raised if `desc` is not a value
    that is known in :py:mod:`pgdumplib.constants`.

    The section is determined automatically from the ``desc``.

    When adding data, use :py:meth:`~Dump.table_data_writer` instead of
    invoking :py:meth:`~Dump.add_entry` directly.

    If ``dependencies`` are specified, they will be validated; if no
    entry is found for a given dependency ``dump_id``, a
    :py:exc:`ValueError` will be raised.

    Other omitted values will be set to the defaults specified in the
    :py:class:`pgdumplib.dump.Entry` class.

    The ``dump_id`` will be auto-calculated based upon the existing entries
    if it is not specified.

    .. note:: The creation of ad-hoc blobs is not supported.

    :param str desc: The entry description
    :param str namespace: The namespace of the entry
    :param str tag: The name/table/relation/etc of the entry
    :param str owner: The owner of the object in Postgres
    :param str defn: The DDL definition for the entry
    :param drop_stmt: A drop statement used to drop the entry before
        recreating it
    :param copy_stmt: A copy statement used when there is a corresponding
        data section.
    :param list dependencies: A list of dump_ids of objects that the entry
        is dependent upon.
    :param str tablespace: The tablespace to use
    :param str tableam: The table access method
    :param int dump_id: The dump id, will be auto-calculated if left empty
    :raises: :py:exc:`ValueError`
    :rtype: pgdumplib.dump.Entry

    """
    if desc not in constants.SECTION_MAPPING:
        raise ValueError(f'Invalid desc: {desc}')

    if dump_id is not None and dump_id < 1:
        raise ValueError('dump_id must be greater than 0')

    dump_ids = [e.dump_id for e in self.entries]

    if dump_id and dump_id in dump_ids:
        raise ValueError(f'dump_id {dump_id!r} is already assigned')

    for dependency in dependencies or []:
        if dependency not in dump_ids:
            raise ValueError(
                f'Dependency dump_id {dependency!r} not found'
            )
    self.entries.append(
        models.Entry(
            dump_id=dump_id or self._next_dump_id(),
            had_dumper=False,
            table_oid='',
            oid='',
            tag=tag or '',
            desc=desc,
            defn=defn or '',
            drop_stmt=drop_stmt or '',
            copy_stmt=copy_stmt or '',
            namespace=namespace or '',
            tablespace=tablespace or None,
            tableam=tableam or None,
            relkind=None,
            owner=owner or '',
            with_oids=False,
            dependencies=dependencies or [],
        )
    )
    return self.entries[-1]
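
A minimal sketch of building up entries in a new dump. The schema and table names and the DDL are illustrative only, and the SCHEMA and TABLE desc values are assumed to be exposed as constants in pgdumplib.constants:

import pgdumplib
from pgdumplib import constants

dump = pgdumplib.new('example')
# Create the schema first so the table can depend on it
schema = dump.add_entry(
    desc=constants.SCHEMA,  # assumed constant for the 'SCHEMA' desc
    tag='my_schema',
    owner='postgres',
    defn='CREATE SCHEMA my_schema;',
    drop_stmt='DROP SCHEMA my_schema;')
dump.add_entry(
    desc=constants.TABLE,  # assumed constant for the 'TABLE' desc
    namespace='my_schema',
    tag='my_table',
    owner='postgres',
    defn='CREATE TABLE my_schema.my_table (id int, name text);',
    dependencies=[schema.dump_id])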

blobs()

Iterator that returns each blob in the dump

:rtype: tuple(int, bytes)

Source code in pgdumplib/dump.py
def blobs(self) -> typing.Generator[tuple[int, bytes], None, None]:
    """Iterator that returns each blob in the dump

    :rtype: tuple(int, bytes)

    """

    def read_oid(fd: io.BufferedReader) -> int | None:
        """Small helper function to deduplicate code"""
        try:
            return struct.unpack('I', fd.read(4))[0]
        except struct.error:
            return None

    for entry in self._data_entries:
        if entry.desc == constants.BLOBS:
            with self._tempfile(entry.dump_id, 'rb') as handle:
                oid: int | None = read_oid(handle)
                while oid:
                    length: int = struct.unpack('I', handle.read(4))[0]
                    yield oid, handle.read(length)
                    oid = read_oid(handle)
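
A short usage sketch, assuming a dump containing large objects was loaded from a hypothetical example.dump file:

import pgdumplib

dump = pgdumplib.load('example.dump')  # hypothetical file name
for oid, blob in dump.blobs():
    print(f'blob {oid}: {len(blob)} bytes')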

get_entry(dump_id)

Return the entry for the given dump_id

:param int dump_id: The dump ID of the entry to return.

Source code in pgdumplib/dump.py
def get_entry(self, dump_id: int) -> models.Entry | None:
    """Return the entry for the given `dump_id`

    :param int dump_id: The dump ID of the entry to return.

    """
    for entry in self.entries:
        if entry.dump_id == dump_id:
            return entry
    return None
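
For example (the dump_id value is illustrative):

entry = dump.get_entry(42)  # returns None when no entry matches
if entry is not None:
    print(entry.desc, entry.namespace, entry.tag)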

load(path)

Load the dump file, extracting all data into a temporary directory

:param os.PathLike path: The path of the dump to load
:raises: :py:exc:RuntimeError
:raises: :py:exc:ValueError

Source code in pgdumplib/dump.py
def load(self, path: str | os.PathLike) -> typing.Self:
    """Load the Dumpfile, including extracting all data into a temporary
    directory

    :param os.PathLike path: The path of the dump to load
    :raises: :py:exc:`RuntimeError`
    :raises: :py:exc:`ValueError`

    """
    if not pathlib.Path(path).exists():
        raise ValueError(f'Path {path!r} does not exist')

    LOGGER.debug('Loading dump file from %s', path)

    self.entries = []  # Wipe out pre-existing entries
    self._handle = open(path, 'rb')
    self._read_header()
    if not constants.MIN_VER <= self.version <= constants.MAX_VER:
        raise ValueError(
            'Unsupported backup version: {}.{}.{}'.format(*self.version)
        )

    if self.version >= (1, 15, 0):
        self.compression_algorithm = constants.COMPRESSION_ALGORITHMS[
            self._compression_algorithm
        ]

        if (
            self.compression_algorithm
            not in constants.SUPPORTED_COMPRESSION_ALGORITHMS
        ):
            raise ValueError(
                'Unsupported compression algorithm: '
                f'{self.compression_algorithm}'
            )
    else:
        self.compression_algorithm = (
            constants.COMPRESSION_GZIP
            if self._read_int() != 0
            else constants.COMPRESSION_NONE
        )

    self.timestamp = self._read_timestamp()
    self.dbname = self._read_bytes().decode(self.encoding)
    self.server_version = self._read_bytes().decode(self.encoding)
    self.dump_version = self._read_bytes().decode(self.encoding)

    self._read_entries()
    self._set_encoding()

    # Cache table data and blobs
    _last_pos = self._handle.tell()

    for entry in self._data_entries:
        if entry.data_state == constants.K_OFFSET_NO_DATA:
            continue
        elif entry.data_state != constants.K_OFFSET_POS_SET:
            raise RuntimeError('Unsupported data format')
        self._handle.seek(entry.offset, io.SEEK_SET)
        block_type, dump_id = self._read_block_header()
        if not dump_id or dump_id != entry.dump_id:
            raise RuntimeError(
                f'Dump IDs do not match ({dump_id} != {entry.dump_id})'
            )
        if block_type == constants.BLK_DATA:
            self._cache_table_data(dump_id)
        elif block_type == constants.BLK_BLOBS:
            self._cache_blobs(dump_id)
        else:
            raise RuntimeError(f'Unknown block type: {block_type!r}')
    return self
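
In practice :py:func:pgdumplib.load wraps this method, so loading an existing dump is a one-liner. A minimal sketch with a hypothetical file name:

import pgdumplib

dump = pgdumplib.load('example.dump')  # hypothetical file name
print(dump.dbname, dump.server_version, dump.timestamp)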

lookup_entry(desc, namespace, tag)

Return the entry for the given namespace and tag

:param str desc: The desc / object type of the entry
:param str namespace: The namespace of the entry
:param str tag: The tag/relation/table name
:raises: :py:exc:ValueError
:rtype: pgdumplib.dump.Entry or None

Source code in pgdumplib/dump.py
def lookup_entry(
    self, desc: str, namespace: str, tag: str
) -> models.Entry | None:
    """Return the entry for the given namespace and tag

    :param str desc: The desc / object type of the entry
    :param str namespace: The namespace of the entry
    :param str tag: The tag/relation/table name
    :raises: :py:exc:`ValueError`
    :rtype: pgdumplib.dump.Entry or None

    """
    if desc not in constants.SECTION_MAPPING:
        raise ValueError(f'Invalid desc: {desc}')
    for entry in [e for e in self.entries if e.desc == desc]:
        if entry.namespace == namespace and entry.tag == tag:
            return entry
    return None
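
A short sketch with illustrative namespace and table names, assuming the TABLE desc value is exposed as a constant in pgdumplib.constants:

from pgdumplib import constants

entry = dump.lookup_entry(constants.TABLE, 'public', 'my_table')
if entry is not None:
    print(entry.defn)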

save(path)

Save the Dump file to the specified path

:param path: The path to save the dump to
:type path: str or os.PathLike

Source code in pgdumplib/dump.py
def save(self, path: str | os.PathLike) -> None:
    """Save the Dump file to the specified path

    :param path: The path to save the dump to
    :type path: str or os.PathLike

    """
    if self._handle is not None and not self._handle.closed:
        self._handle.close()
    self.compression_algorithm = constants.COMPRESSION_NONE
    self._handle = open(path, 'wb')
    self._save()
    self._handle.close()
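
A minimal round-trip sketch (the database name and file name are illustrative):

import pgdumplib

dump = pgdumplib.new('example')
# ... add entries and data here ...
dump.save('example.dump')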

table_data(namespace, table)

Iterator that returns data for the given namespace and table

:param str namespace: The namespace/schema for the table
:param str table: The table name
:raises: :py:exc:pgdumplib.exceptions.EntityNotFoundError

Source code in pgdumplib/dump.py
def table_data(
    self, namespace: str, table: str
) -> typing.Generator[str | tuple[typing.Any, ...], None, None]:
    """Iterator that returns data for the given namespace and table

    :param str namespace: The namespace/schema for the table
    :param str table: The table name
    :raises: :py:exc:`pgdumplib.exceptions.EntityNotFoundError`

    """
    for entry in self._data_entries:
        if entry.namespace == namespace and entry.tag == table:
            for row in self._read_table_data(entry.dump_id):
                yield self._converter.convert(row)
            return
    raise exceptions.EntityNotFoundError(namespace=namespace, table=table)
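
A short sketch of iterating over rows, with illustrative names; each row has already been passed through the configured converter:

import pgdumplib

dump = pgdumplib.load('example.dump')  # hypothetical file name
for row in dump.table_data('public', 'my_table'):
    print(row)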

table_data_writer(entry, columns)

A context manager that is used to return a :py:class:~pgdumplib.dump.TableData instance, which can be used to add table data to the dump.

When invoked for a given entry containing the table definition, a corresponding TABLE DATA entry is created on first use, and subsequent invocations for the same entry return the same writer.

:param Entry entry: The entry for the table to add data for
:param columns: The ordered list of table columns
:type columns: list or tuple
:rtype: TableData

Source code in pgdumplib/dump.py
@contextlib.contextmanager
def table_data_writer(
    self, entry: models.Entry, columns: abc.Sequence
) -> typing.Generator[TableData, None, None]:
    """A context manager that is used to return a
    :py:class:`~pgdumplib.dump.TableData` instance, which can be used
    to add table data to the dump.

    When invoked for a given entry containing the table definition, a
    corresponding ``TABLE DATA`` entry is created on first use, and
    subsequent invocations for the same entry return the same writer.

    :param Entry entry: The entry for the table to add data for
    :param columns: The ordered list of table columns
    :type columns: list or tuple
    :rtype: TableData

    """
    if entry.dump_id not in self._writers:
        dump_id = self._next_dump_id()
        self.entries.append(
            models.Entry(
                dump_id=dump_id,
                had_dumper=True,
                tag=entry.tag,
                desc=constants.TABLE_DATA,
                copy_stmt='COPY {}.{} ({}) FROM stdin;'.format(
                    entry.namespace, entry.tag, ', '.join(columns)
                ),
                namespace=entry.namespace,
                owner=entry.owner,
                dependencies=[entry.dump_id],
                data_state=constants.K_OFFSET_POS_NOT_SET,
            )
        )
        self._writers[entry.dump_id] = TableData(
            dump_id, self._temp_dir.name, self.encoding
        )
    yield self._writers[entry.dump_id]
    return None
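
A sketch of writing rows for a table entry; the names and values are illustrative, and TableData.append is assumed to accept one positional argument per column:

import pgdumplib
from pgdumplib import constants

dump = pgdumplib.new('example')
table = dump.add_entry(
    constants.TABLE,  # assumed constant for the 'TABLE' desc
    namespace='public',
    tag='my_table',
    defn='CREATE TABLE public.my_table (id int, name text);')
with dump.table_data_writer(table, ['id', 'name']) as writer:
    writer.append(1, 'foo')
    writer.append(2, 'bar')
dump.save('example.dump')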