reviewboard.diffviewer.parser¶
Diff parsing support.
- class ParsedDiff(parser, uses_commit_ids_as_revisions=False)[source]¶
Bases:
object
Parsed information from a diff.
This stores information on the diff as a whole, along with a list of commits made to the diff and a list of files within each.
Extra data can be stored by the parser, which will be made available in DiffSet.extra_data.
This is flexible enough to accommodate a variety of diff formats, including DiffX files.
This class is meant to be used internally and by subclasses of BaseDiffParser.
New in version 4.0.5.
- changes¶
The list of changes parsed in this diff. There should always be at least one.
- Type:
- extra_data¶
Extra data to store along with the information on the diff. The contents will be stored directly in DiffSet.extra_data.
- Type:
- parser¶
The diff parser that parsed this file.
- Type:
- uses_commit_ids_as_revisions¶
Whether commit IDs are used as file revisions.
A commit ID will be used if an explicit revision isn’t available for a file. For instance, if a parent diff is available, and a file isn’t present in the parent diff, the file will use the parent diff’s parent commit ID as the parent revision.
- Type:
- __init__(parser, uses_commit_ids_as_revisions=False)[source]¶
Initialize the parsed diff information.
- Parameters:
parser (BaseDiffParser) – The diff parser that parsed this file.
uses_commit_ids_as_revisions (bool, optional) – Whether commit IDs are used as file revisions.
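The fallback described for uses_commit_ids_as_revisions can be pictured with a small sketch. It does not use Review Board's own code: pick_parent_revision and its arguments are hypothetical names chosen for illustration, and the real parser may apply further rules.

```python
from typing import Optional


def pick_parent_revision(
    explicit_revision: Optional[bytes],
    parent_commit_id: Optional[bytes],
    uses_commit_ids_as_revisions: bool,
) -> Optional[bytes]:
    """Return the revision to record for a file's parent.

    Hypothetical helper: an explicit revision always wins. The commit ID
    is only used as a fallback, and only when the flag is enabled.
    """
    if explicit_revision is not None:
        return explicit_revision

    if uses_commit_ids_as_revisions:
        return parent_commit_id

    return None


# An explicit revision is kept as-is.
print(pick_parent_revision(b'r123', b'abc123', True))

# No explicit revision: fall back to the commit ID when allowed.
print(pick_parent_revision(None, b'abc123', True))
print(pick_parent_revision(None, b'abc123', False))
```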
- class ParsedDiffChange(parsed_diff)[source]¶
Bases:
object
Parsed change information from a diff.
This stores information on a change to a tree, consisting of a set of parsed files and extra data to store (in DiffCommit.extra_data).
This will often map to a commit, or just a typical collection of files in a diff. Traditional diffs will have only one of these. DiffX files may have many (but for the moment, only diffs with a single change can be handled when processing these results).
New in version 4.0.5.
- extra_data¶
Extra data to store along with the information on the change. The contents will be stored directly in DiffCommit.extra_data.
- Type:
- files¶
The list of files parsed for this change. There should always be at least one.
- Type:
- commit_id: TypedProperty[Optional[bytes], Optional[bytes]]¶
The ID of the commit, parsed from the diff.
This may be None.
- Type:
- parent_commit_id: TypedProperty[Optional[bytes], Optional[bytes]]¶
The ID of the primary parent commit, parsed from the diff.
This may be None.
- Type:
- __init__(parsed_diff)[source]¶
Initialize the parsed diff information.
- Parameters:
parsed_diff (ParsedDiff) – The parent parsed diff information.
- __annotations__ = {'commit_id': '_BytesProperty', 'parent_commit_id': '_BytesProperty'}¶
- class ParsedDiffFile(parser=None, parsed_diff_change=None, **kwargs)[source]¶
Bases:
object
A parsed file from a diff.
This stores information on a single file represented in a diff, including the contents of that file’s diff, as parsed by DiffParser or one of its subclasses.
Parsers should set the attributes on this based on the contents of the diff, and should add any data found in the diff.
This class is meant to be used internally and by subclasses of BaseDiffParser.
Changed in version 4.0.6: Added old_symlink_target and new_symlink_target.
Changed in version 4.0.5: Diff parsers that manually construct instances must pass in parsed_diff_change instead of parser when constructing the object, and must call discard() after construction if the file isn’t wanted in the results.
- copied¶
Whether this represents a file that has been copied. The file may or may not be modified in the process.
- Type:
- moved¶
Whether this represents a file that has been moved/renamed. The file may or may not be modified in the process.
- Type:
- parser¶
The diff parser that parsed this file.
- Type:
- skip¶
Whether this file should be skipped by the parser. If any of the parser methods set this, the file will stop parsing and will be excluded from results.
- Type:
- orig_filename: TypedProperty[Optional[bytes], Optional[bytes]]¶
The parsed original name of the file.
- Type:
- orig_file_details: TypedProperty[Optional[Union[bytes, Revision]], Optional[Union[bytes, Revision]]]¶
The parsed file details of the original file.
This will usually be a revision.
- Type:
- modified_filename: TypedProperty[Optional[bytes], Optional[bytes]]¶
The parsed modified name of the file.
This may be the same as orig_filename.
- Type:
- modified_file_details: TypedProperty[Optional[Union[bytes, Revision]], Optional[Union[bytes, Revision]]]¶
The parsed file details of the modified file.
This will usually be a revision.
- Type:
- index_header_value: TypedProperty[Optional[bytes], Optional[bytes]]¶
The parsed value for an Index header.
If present in the diff, this usually contains a filename, but may contain other content as well, depending on the variation of the diff format.
- Type:
- old_symlink_target: TypedProperty[Optional[bytes], Optional[bytes]]¶
The old target for a symlink.
New in version 4.0.6.
- Type:
- new_symlink_target: TypedProperty[Optional[bytes], Optional[bytes]]¶
The new target for a symlink.
New in version 4.0.6.
- Type:
- old_unix_mode: TypedProperty[Optional[str], Optional[str]]¶
The old UNIX mode for the file.
New in version 4.0.6.
- Type:
- new_unix_mode: TypedProperty[Optional[str], Optional[str]]¶
The new UNIX mode for the file.
New in version 4.0.6.
- Type:
- __init__(parser=None, parsed_diff_change=None, **kwargs)[source]¶
Initialize the parsed file information.
Changed in version 4.0.5: Added the parsed_diff_change argument (which will be required in Review Board 6.0).
Deprecated the parser argument (which will be removed in Review Board 6.0).
- Parameters:
parser (reviewboard.diffviewer.parser.BaseDiffParser, optional) – The diff parser that parsed this file.
This is deprecated and will be removed in Review Board 6.0.
parsed_diff_change (ParsedDiffChange, optional) – The diff change that owns this file.
This will be required in Review Board 6.0.
- property data[source]¶
The data for this diff.
This must be accessed after finalize() has been called.
- discard()[source]¶
Discard this from the parent change.
This will remove it from the list of files. It’s intended for use when a diff parser is populating the diff but then determines the file is no longer needed.
New in version 4.0.5.
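The relationship between constructing a file and discarding it can be sketched as follows. This is a minimal stand-in, not Review Board's implementation: SketchChange and SketchFile are hypothetical classes, and the sketch assumes that constructing a file registers it with its owning change, which is why an unwanted file has to be discarded explicitly.

```python
class SketchChange:
    """Hypothetical stand-in for a parsed change that owns files."""

    def __init__(self):
        self.files = []


class SketchFile:
    """Hypothetical stand-in for a parsed file."""

    def __init__(self, parsed_diff_change):
        self.parsed_diff_change = parsed_diff_change

        # Assumption: a new file registers itself with its owning change.
        parsed_diff_change.files.append(self)

    def discard(self):
        # Remove this file from the owning change's list of files.
        self.parsed_diff_change.files.remove(self)


change = SketchChange()
wanted = SketchFile(change)
unwanted = SketchFile(change)

# The parser decides it doesn't need the second file after all.
unwanted.discard()

print(len(change.files))
```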
- finalize()[source]¶
Finalize the parsed diff.
This makes the diff data available to consumers and closes the buffer for writing.
- prepend_data(data)[source]¶
Prepend data to the buffer.
- Parameters:
data (bytes) – The data to prepend.
- append_data(data)[source]¶
Append data to the buffer.
- Parameters:
data (bytes) – The data to append.
- __annotations__ = {'index_header_value': '_BytesProperty', 'modified_file_details': '_RevisionProperty', 'modified_filename': '_BytesProperty', 'new_symlink_target': '_BytesProperty', 'new_unix_mode': '_StrProperty', 'old_symlink_target': '_BytesProperty', 'old_unix_mode': '_StrProperty', 'orig_file_details': '_RevisionProperty', 'orig_filename': '_BytesProperty'}¶
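The interplay of append_data(), prepend_data(), finalize(), and data can be illustrated with a small buffer. This is only a sketch under stated assumptions: SketchFileBuffer is a hypothetical class built on io.BytesIO, and the way Review Board stores the buffer internally may differ.

```python
import io


class SketchFileBuffer:
    """Hypothetical buffer mimicking the append/prepend/finalize flow."""

    def __init__(self):
        self._buffer = io.BytesIO()
        self._data = None

    def append_data(self, data: bytes) -> None:
        # New content goes at the end of the buffer.
        self._buffer.write(data)

    def prepend_data(self, data: bytes) -> None:
        # Rebuild the buffer with the new data placed first.
        existing = self._buffer.getvalue()
        self._buffer = io.BytesIO()
        self._buffer.write(data)
        self._buffer.write(existing)

    def finalize(self) -> None:
        # Snapshot the contents, then close the buffer for writing.
        self._data = self._buffer.getvalue()
        self._buffer.close()

    @property
    def data(self) -> bytes:
        if self._data is None:
            raise ValueError('finalize() must be called before reading data')

        return self._data


buf = SketchFileBuffer()
buf.append_data(b'+new line\n')
buf.prepend_data(b'@@ -1,0 +1,1 @@\n')
buf.finalize()

print(buf.data)
```

Reading data before finalize() raises an error in this sketch, which mirrors the documented requirement that data only be accessed after finalization.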
- class BaseDiffParser(data, uses_commit_ids_as_revisions=False)[source]¶
Bases:
object
Base class for a diff parser.
This is a low-level, basic foundational interface for a diff parser. It performs type checking of the incoming data and provides a couple of methods for subclasses to implement.
Most SCM implementations will want to either subclass DiffParser or use DiffXParser.
New in version 4.0.5.
- uses_commit_ids_as_revisions¶
Whether commit IDs are used as file revisions.
See ParsedDiff.uses_commit_ids_as_revisions.
- Type:
- __init__(data, uses_commit_ids_as_revisions=False)[source]¶
Initialize the parser.
- parse_diff()[source]¶
Parse the diff.
This will parse the content of the file, returning a representation of the diff file and its content.
This must be implemented by subclasses.
- Returns:
The resulting parsed diff information.
- Return type:
- Raises:
NotImplementedError – This wasn’t implemented by a subclass.
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing part of the diff. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
- raw_diff(diffset_or_commit)[source]¶
Return a raw diff as a string.
This takes a DiffSet or DiffCommit and generates a new, single diff file that represents all the changes made. It’s used to regenerate a diff and serve it up for other tools or processes to use.
This must be implemented by subclasses.
- Parameters:
diffset_or_commit (reviewboard.diffviewer.models.diffset.DiffSet or reviewboard.diffviewer.models.diffcommit.DiffCommit) – The DiffSet or DiffCommit to render.
If passing in a DiffSet, only the cumulative diff’s file contents will be returned.
If passing in a DiffCommit, only that commit’s file contents will be returned.
- Returns:
The diff composed of all the component FileDiffs.
- Return type:
- Raises:
NotImplementedError – This wasn’t implemented by a subclass.
TypeError – The provided diffset_or_commit wasn’t of a supported type.
- normalize_diff_filename(filename)[source]¶
Normalize filenames in diffs.
This returns a normalized filename suitable for populating in FileDiff.source_file or FileDiff.dest_file, or for when presenting a filename to the UI.
By default, this strips off any leading slashes, which might occur due to differences in various diffing methods or APIs.
Subclasses can override this to provide additional methods of normalization.
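A minimal sketch of the default behavior and of a subclass override follows. Both classes are hypothetical stand-ins rather than Review Board classes, filenames are shown as str for simplicity, and the backslash conversion is just one example of an extra normalization step a subclass might want.

```python
class SketchBaseParser:
    """Hypothetical stand-in showing only filename normalization."""

    def normalize_diff_filename(self, filename: str) -> str:
        # Assumption: every leading slash is removed.
        return filename.lstrip('/')


class SketchWindowsPathParser(SketchBaseParser):
    """Hypothetical subclass adding one more normalization step."""

    def normalize_diff_filename(self, filename: str) -> str:
        # Convert backslashes first, then apply the default behavior.
        filename = filename.replace('\\', '/')

        return super().normalize_diff_filename(filename)


print(SketchBaseParser().normalize_diff_filename('/trunk/README'))
print(SketchWindowsPathParser().normalize_diff_filename('\\src\\main.c'))
```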
- class DiffParser(data, **kwargs)[source]¶
Bases:
BaseDiffParser
Parses diff files, allowing subclasses to specialize parsing behavior.
This class provides the base functionality for parsing Unified Diff files. It looks for common information present in many variations of diffs, such as Index: lines, in order to extract files and their modified content from a diff.
Subclasses can extend the parsing behavior to extract additional metadata or handle special representations of changes. They may want to override the following methods:
normalize_diff_filename()
- INDEX_SEP = b'==================================================================='[source]¶
A separator string below an Index header.
This is commonly found immediately below an Index: header, meant to help locate the beginning of the metadata or changes made to a file.
Its presence and location are not guaranteed.
- parse_diff()[source]¶
Parse the diff.
Subclasses should override this if working with a diff format that extracts more than one change from a diff.
New in version 4.0.5: Historically, parse() was the main method used to parse a diff. That’s now used exclusively to parse a list of files for the default parsed_diff_change. The old method is around for compatibility, but is no longer called directly outside of this class.
- Returns:
The resulting parsed diff information.
- Return type:
- Raises:
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing part of the diff. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
- parse()[source]¶
Parse the diff and return a list of files.
This will parse the content of the file, returning any files that were found.
- Version Change:
4.0.5: Historically, this was the main method used to parse a diff. It’s now used exclusively to parse a list of files for the default parsed_diff_change, and parse_diff() is the main method used to parse a diff. This method is around for compatibility, but is no longer called directly outside of this class.
- Returns:
The resulting list of files.
- Return type:
- Raises:
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing part of the diff. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
- parse_diff_line(linenum, parsed_file)[source]¶
Parse a line of data in a diff.
This will append the line to the parsed file’s data, and if the content represents active changes to a file, its insert/delete counts will be updated to reflect them.
- Parameters:
linenum (int) – The 0-based line number.
parsed_file (ParsedDiffFile) – The current parsed diff file info.
- Returns:
The next line number to parse.
- Return type:
- parse_change_header(linenum)[source]¶
Parse a header before a change to a file.
This will attempt to parse the following information, starting at the specified line in the diff:
Any special file headers (such as Index: lines) through parse_special_header()
A standard Unified Diff file header (through parse_diff_header())
Any content after the header (through parse_after_headers())
If the special or diff headers are able to populate the original and modified filenames and revisions/file details, and none of the methods above mark the file as skipped (by setting ParsedDiffFile.skip), then this will finish by appending all parsed data and returning a parsed file entry.
Subclasses that need to control parsing logic should override one or more of the above methods.
- Parameters:
linenum (int) – The line number to begin parsing.
- Returns:
A tuple containing the following:
The next line number to parse
The populated ParsedDiffFile instance for this file
- Return type:
- Raises:
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing the change header. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
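The order of the three header stages, and the effect of the skip flag, can be sketched with a small driver. Everything here is hypothetical (SketchFile, parse_change_header_sketch, and the three step functions); the sketch only tracks filenames and ignores revisions and data buffering.

```python
class SketchFile:
    """Hypothetical stand-in holding only what this sketch needs."""

    def __init__(self):
        self.orig_filename = None
        self.modified_filename = None
        self.skip = False


def special_header(lines, linenum, parsed_file):
    # Stage 1: consume an optional "Index:" line.
    if linenum < len(lines) and lines[linenum].startswith(b'Index: '):
        linenum += 1

    return linenum


def diff_header(lines, linenum, parsed_file):
    # Stage 2: consume the "---" and "+++" pair, if both are present.
    if (linenum + 1 < len(lines) and
        lines[linenum].startswith(b'--- ') and
        lines[linenum + 1].startswith(b'+++ ')):
        parsed_file.orig_filename = lines[linenum][4:].split(b'\t')[0]
        parsed_file.modified_filename = lines[linenum + 1][4:].split(b'\t')[0]
        linenum += 2

    return linenum


def after_headers(lines, linenum, parsed_file):
    # Stage 3: nothing to do by default.
    return linenum


def parse_change_header_sketch(lines, linenum, steps):
    """Run each stage in order and return (next_linenum, file_or_None)."""
    parsed_file = SketchFile()

    for step in steps:
        linenum = step(lines, linenum, parsed_file)

        if parsed_file.skip:
            # A stage asked to skip this file, so stop parsing it.
            return linenum, None

    if parsed_file.orig_filename and parsed_file.modified_filename:
        return linenum, parsed_file

    return linenum, None


lines = [
    b'Index: foo.c',
    b'--- foo.c\t(revision 1)',
    b'+++ foo.c\t(working copy)',
    b'@@ -1 +1 @@',
]
next_linenum, parsed = parse_change_header_sketch(
    lines, 0, [special_header, diff_header, after_headers])

print(next_linenum, parsed.orig_filename, parsed.modified_filename)
```

Keeping each stage as a separate callable mirrors the documented design, in which a subclass overrides one stage without reimplementing the whole header parse.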
- parse_special_header(linenum, parsed_file)[source]¶
Parse a special diff header marking the start of a new file’s info.
This attempts to locate an Index: line at the specified line number, which usually indicates the beginning of a file’s information in a diff (for Unified Diff variants that support it). By default, this method expects the line to be found at linenum.
If present, the value found immediately after the Index: will be stored in ParsedDiffFile.index_header_value, allowing subclasses to make a determination based on its contents (which may vary between types of diffs, but should include at least a filename).
If the Index: line is not present, this won’t do anything by default.
Subclasses can override this to parse additional information before the standard diff header. They may also set ParsedDiffFile.skip to skip the rest of this file and begin parsing a new entry at the returned line number.
- Parameters:
linenum (int) – The line number to begin parsing.
parsed_file (ParsedDiffFile) – The file currently being parsed.
- Returns:
The next line number to parse.
- Return type:
- Raises:
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing the special header. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
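Locating an Index: line and its optional separator might look like the sketch below. parse_index_header is a hypothetical function, and it makes the simplifying assumption that the separator, when present, sits directly below the Index: line; the real parser may search differently.

```python
INDEX_SEP = b'==================================================================='


def parse_index_header(lines, linenum):
    """Return (next_linenum, index_value); index_value may be None."""
    if linenum < len(lines) and lines[linenum].startswith(b'Index: '):
        # Keep everything after the "Index: " prefix.
        index_value = lines[linenum][len(b'Index: '):].strip()
        linenum += 1

        # The separator is optional, so only skip it when present.
        if linenum < len(lines) and lines[linenum] == INDEX_SEP:
            linenum += 1

        return linenum, index_value

    # No Index: line here, so leave the position untouched.
    return linenum, None


lines = [
    b'Index: trunk/foo.c',
    INDEX_SEP,
    b'--- trunk/foo.c\t(revision 1)',
]

print(parse_index_header(lines, 0))
```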
- parse_diff_header(linenum, parsed_file)[source]¶
Parse a standard header before changes made to a file.
This attempts to parse the --- (original) and +++ (modified) file lines, which are usually present right before any changes to the file. By default, this method expects the --- line to be found at linenum.
If found, this will populate ParsedDiffFile.orig_filename, ParsedDiffFile.orig_file_details, ParsedDiffFile.modified_filename, and ParsedDiffFile.modified_file_details.
This calls out to parse_filename_header() to help parse the contents immediately after the --- or +++.
Subclasses can override this to parse these lines differently, or to process the results of these lines (such as converting special filenames to states like “deleted” or “new file”). They may also set ParsedDiffFile.skip to skip the rest of this file and begin parsing a new entry at the returned line number.
- Parameters:
linenum (int) – The line number to begin parsing.
parsed_file (ParsedDiffFile) – The file currently being parsed.
- Returns:
The next line number to parse.
- Return type:
- Raises:
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing the diff header. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
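As an illustration of post-processing the parsed filenames, a subclass could map special names to states. The helper below is hypothetical, and it assumes a diff variant that marks a missing side with /dev/null; other variants signal new or deleted files in other ways.

```python
def classify_change(orig_filename: bytes, modified_filename: bytes) -> str:
    """Map parsed filenames to a simple state string.

    Hypothetical helper. Assumption: a missing side of the change is
    represented by the filename /dev/null.
    """
    if orig_filename == b'/dev/null':
        return 'new file'

    if modified_filename == b'/dev/null':
        return 'deleted'

    return 'modified'


print(classify_change(b'/dev/null', b'docs/intro.txt'))
print(classify_change(b'docs/old.txt', b'/dev/null'))
print(classify_change(b'docs/intro.txt', b'docs/intro.txt'))
```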
- parse_after_headers(linenum, parsed_file)[source]¶
Parse information after a diff header but before diff data.
This attempts to parse the information found after parse_diff_header() is called, but before gathering any lines that are part of the diff contents. It’s intended for the few diff formats that may place content at this location.
By default, this does nothing.
Subclasses can override this to provide custom parsing of any lines that may exist here. They may also set ParsedDiffFile.skip to skip the rest of this file and begin parsing a new entry at the returned line number.
- Parameters:
linenum (int) – The line number to begin parsing.
parsed_file (ParsedDiffFile) – The file currently being parsed.
- Returns:
The next line number to parse.
- Return type:
- Raises:
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing the diff header. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
- parse_filename_header(s, linenum)[source]¶
Parse the filename found in a diff filename line.
This parses the value after a --- or +++ indicator (or a special variant handled by a subclass), normalizing the filename and any following file details, and returning both for processing and storage.
Often, the file details will be a revision for the original file, but this is not guaranteed, and is up to the variation of the diff format.
By default, this will assume that a filename and file details are separated by either a single tab, or two or more spaces. If neither is found, this will fail to parse.
This must parse only the provided value, and cannot parse subsequent lines.
Subclasses can override this behavior to parse these lines another way, or to normalize filenames (handling escaping or filenames with spaces as needed by that particular diff variation).
- Parameters:
- Returns:
A tuple containing:
The filename (as bytes)
The additional file information (as bytes)
- Return type:
- Raises:
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing the diff header. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
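The default separator rule (a tab, or else two or more spaces) can be sketched as follows. split_filename_header and SketchParseError are hypothetical names standing in for the method and for the parser's error type; filename normalization is left out.

```python
import re


class SketchParseError(Exception):
    """Hypothetical stand-in for a diff parsing error."""


def split_filename_header(s: bytes):
    """Split the value after "---" or "+++" into (filename, details)."""
    if b'\t' in s:
        # A tab is the unambiguous case and allows spaces in filenames.
        filename, details = s.split(b'\t', 1)
    elif b'  ' in s:
        # Otherwise, treat the first run of 2+ spaces as the separator.
        filename, details = re.split(br'  +', s, maxsplit=1)
    else:
        raise SketchParseError(
            'No separator found between the filename and file details')

    return filename, details


print(split_filename_header(b'foo.c\t(revision 4)'))
print(split_filename_header(b'my file.c   (working copy)'))
```

Checking for a tab first means a filename containing single spaces still parses cleanly when the diff uses tab separators.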
- raw_diff(diffset_or_commit)[source]¶
Return a raw diff as a string.
This takes a DiffSet or DiffCommit and generates a new, single diff file that represents all the changes made. It’s used to regenerate a diff and serve it up for other tools or processes to use.
Subclasses can override this to provide any special logic for building the diff.
- Parameters:
diffset_or_commit (reviewboard.diffviewer.models.diffset.DiffSet or reviewboard.diffviewer.models.diffcommit.DiffCommit) – The DiffSet or DiffCommit to render.
If passing in a DiffSet, only the cumulative diff’s file contents will be returned.
If passing in a DiffCommit, only that commit’s file contents will be returned.
- Returns:
The diff composed of all the component FileDiffs.
- Return type:
- Raises:
TypeError – The provided diffset_or_commit wasn’t of a supported type.
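Composing a single diff from per-file diffs can be pictured as a simple concatenation. The function below is a hypothetical sketch operating on plain bytes rather than on FileDiff models, and the newline handling is an assumption made so that each file's header starts on its own line.

```python
def join_file_diffs(file_diffs):
    """Concatenate per-file diff contents (bytes) into one diff.

    Hypothetical helper. Assumption: a missing trailing newline is added
    after each non-empty file diff.
    """
    parts = []

    for diff in file_diffs:
        parts.append(diff)

        if diff and not diff.endswith(b'\n'):
            parts.append(b'\n')

    return b''.join(parts)


combined = join_file_diffs([
    b'--- a.txt\n+++ a.txt\n@@ -1 +1 @@\n-old\n+new',
    b'--- b.txt\n+++ b.txt\n@@ -1 +1 @@\n-foo\n+bar\n',
])

print(combined.decode('utf-8'))
```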
- __annotations__ = {}¶
- class DiffXParser(data, uses_commit_ids_as_revisions=False)[source]¶
Bases:
BaseDiffParser
Parser for DiffX files.
This will parse files conforming to the DiffX standard, storing the diff content provided in each file section, as well as all the information available in each DiffX section (options, preamble, metadata) as extra_data. This allows the diffs to be re-built on download.
This parser is sufficient for almost any DiffX need, but subclasses can be created that augment the stored extra_data for any of the parsed objects.
New in version 4.0.5: This is experimental in 4.0.x, with plans to make it stable for 5.0. The API may change during this time.
- parse_diff()[source]¶
Parse the diff.
This will parse the content of the DiffX file, returning a representation of the diff file and its content.
- Returns:
The resulting parsed diff information.
- Return type:
- Raises:
reviewboard.diffviewer.errors.DiffParserError – There was an error parsing part of the diff. This may be a corrupted diff, or an error in the parsing implementation. Details are in the error message.
- raw_diff(diffset_or_commit)[source]¶
Return a raw diff as a string.
This takes a DiffSet or DiffCommit and generates a new, single DiffX file that represents all the changes made, based on the previously-stored DiffX information in extra_data dictionaries. It’s used to regenerate a DiffX and serve it up for other tools or processes to use.
- Parameters:
diffset_or_commit (reviewboard.diffviewer.models.diffset.DiffSet or reviewboard.diffviewer.models.diffcommit.DiffCommit) – The DiffSet or DiffCommit to render.
If passing in a DiffSet, the full uploaded DiffX file contents will be returned.
If passing in a DiffCommit, a new DiffX representing only that commit’s contents will be returned. This will lack the main preamble or metadata, or any other changes previously in the DiffX file.
- Returns:
The resulting DiffX file contents.
- Return type:
- Raises:
TypeError – The provided diffset_or_commit value wasn’t of a supported type.
- __annotations__ = {}¶