reviewboard.diffviewer.diffutils¶
-
CHUNK_RANGE_RE
= <_sre.SRE_Pattern object>[source]¶ A regex for matching a diff chunk header.
New in version 3.0.18.
-
convert_to_unicode
(s, encoding_list)[source]¶ Return the passed string as a unicode object.
If conversion to unicode fails, we try the user-specified encoding, which defaults to ISO 8859-15. This can be overridden by users inside the repository configuration, which gives users repository-level control over file encodings.
Ideally, we’d like to have per-file encodings, but this is hard. The best we can do now is a comma-separated list of things to try.
Returns the encoding type which was used and the decoded unicode object.
Returns: A tuple with the following information:
- A compatible encoding (unicode).
- The Unicode data (unicode).
Return type: tuple
Raises:
- TypeError – The provided value was not a Unicode string, byte string, or a byte array.
- UnicodeDecodeError – None of the encoding types were valid for the provided string.
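The fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the Review Board implementation: the name decode_with_fallbacks is hypothetical, and returning 'utf-8' for input that is already Unicode is an assumption.

```python
def decode_with_fallbacks(data, encoding_list):
    """Return (encoding, text) for the first encoding that decodes data.

    Hypothetical stand-in for illustration only.
    """
    if isinstance(data, str):
        # Assumption: already-decoded text is reported as UTF-8.
        return 'utf-8', data

    if isinstance(data, bytearray):
        data = bytes(data)

    if not isinstance(data, bytes):
        raise TypeError('Expected str, bytes, or bytearray, got %s'
                        % type(data).__name__)

    for encoding in encoding_list:
        try:
            return encoding, data.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Either the bytes are invalid for this encoding or the
            # encoding name is unknown; move on to the next candidate.
            continue

    raise UnicodeDecodeError('unknown', data, 0, len(data),
                             'None of the provided encodings were valid')
```

Trying a strict encoding such as UTF-8 before a permissive single-byte encoding such as ISO 8859-15 matters, because the single-byte encoding will accept almost any byte sequence.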
-
convert_line_endings
(data)[source]¶ Convert line endings in a file.
Some types of repositories provide files with a single trailing Carriage Return (\r), even if the rest of the file used a CRLF (\r\n) throughout. In these cases, GNU diff will add a "\ No newline at end of file" to the end of the diff, which GNU patch understands and will apply to files with just a trailing \r.
However, we normalize \r to \n, which breaks GNU patch in these cases. This function works around this by removing the last \r and then converting standard types of newlines to a \n.
This is not meant for use in providing byte-compatible versions of files, but rather to help with comparing lines-for-lines in situations where two versions of a file may come from different platforms with different newlines.
Parameters: data (bytes or unicode) – A string to normalize. This supports either byte strings or Unicode strings.
Returns: The data with newlines converted, in the original string type.
Return type: bytes or unicode
Raises: TypeError – The data argument provided is not a byte string or Unicode string.
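The two steps described above (drop a lone trailing \r, then normalize the remaining newline styles) might look something like the sketch below. The function name normalize_newlines and the exact regular expressions are assumptions made for illustration, not the library's code.

```python
import re

# Hypothetical patterns: match \r\r\n, \r\n, or a lone \r.
_NEWLINE_BYTES_RE = re.compile(br'\r\r\n|\r\n|\r')
_NEWLINE_TEXT_RE = re.compile(r'\r\r\n|\r\n|\r')


def normalize_newlines(data):
    """Normalize newlines to \\n, preserving the input string type."""
    if isinstance(data, bytes):
        cr, lf, pattern = b'\r', b'\n', _NEWLINE_BYTES_RE
    elif isinstance(data, str):
        cr, lf, pattern = '\r', '\n', _NEWLINE_TEXT_RE
    else:
        raise TypeError('data must be a byte string or Unicode string')

    # Step 1: remove a single trailing Carriage Return, if present.
    if data.endswith(cr):
        data = data[:-1]

    # Step 2: convert the remaining newline styles to \n.
    return pattern.sub(lf, data)
```

For example, normalize_newlines(b'a\r\nb\r') gives b'a\nb': the final \r is dropped rather than turned into an extra newline.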
-
split_line_endings
(data)[source]¶ Split a string into lines while preserving all non-CRLF characters.
Unlike str.splitlines(), this will only split on the following character sequences: \n, \r, \r\n, and \r\r\n.
This is needed to prevent the sort of issues encountered with Unicode strings when calling str.splitlines(), which is that form feed characters would be split. patch and diff accept form feed characters as valid characters in diffs and do not treat them as newlines, but str.splitlines() will treat them as newlines anyway.
Parameters: data (bytes or unicode) – The data to split into lines.
Returns: The list of lines.
Return type: list of bytes or unicode
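To illustrate the difference from str.splitlines(), here is a small sketch for Unicode strings only. The name split_lines_keep_form_feeds is hypothetical, and dropping a trailing empty entry is an assumption about the desired result rather than documented behavior.

```python
import re

# Longest sequences first so \r\r\n is consumed as one line break.
_LINE_BREAK_RE = re.compile(r'\r\r\n|\r\n|\n|\r')


def split_lines_keep_form_feeds(text):
    """Split text on \\n, \\r, \\r\\n, and \\r\\r\\n only."""
    lines = _LINE_BREAK_RE.split(text)

    # Assumption: a trailing line break should not yield an empty last line.
    if lines and lines[-1] == '':
        lines.pop()

    return lines
```

With the input 'a\x0cb\n', str.splitlines() returns ['a', 'b'] because it treats the form feed (\x0c) as a line boundary, whereas the sketch above returns ['a\x0cb'].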
-
patch
(diff, orig_file, filename, request=None)[source]¶ Apply a diff to a file.
This delegates out to patch because no one except Larry Wall knows how to patch.
Parameters:
- diff (bytes) – The contents of the diff to apply.
- orig_file (bytes) – The contents of the original file.
- filename (unicode) – The name of the file being patched.
- request (django.http.HttpRequest, optional) – The HTTP request, for use in logging.
Returns: The contents of the patched file.
Raises: reviewboard.diffviewer.errors.PatchError – An error occurred when trying to apply the patch.
-
get_original_file_from_repo
(filediff, request=None, encoding_list=None)[source]¶ Return the pre-patched file for the FileDiff from the repository.
The parent diff will be applied if it exists.
New in version 4.0.
Parameters: - filediff (reviewboard.diffviewer.models.filediff.FileDiff) – The FileDiff to retrieve the pre-patch file for.
- request (django.http.HttpRequest, optional) – The HTTP request from the client.
- encoding_list (list of unicode, optional) –
A custom list of encodings to try when processing the file. This will override the encoding list normally retrieved from the FileDiff and repository.
If there’s already a known valid encoding for the file, it will be used instead.
This is here for compatibility and will be removed in Review Board 5.0.
Returns: The pre-patched file.
Raises:
- UnicodeDecodeError – The source file was not compatible with any of the available encodings.
- reviewboard.diffviewer.errors.PatchError – An error occurred when trying to apply the patch.
- reviewboard.scmtools.errors.SCMError – An error occurred while computing the pre-patch file.
-
get_original_file
(filediff, request=None, encoding_list=None)[source]¶ Return the pre-patch file of a FileDiff.
Changed in version 4.0: The encoding_list parameter should no longer be provided by callers. Encoding lists are now calculated automatically. Passing a custom list will override the calculated one.
Parameters:
- filediff (reviewboard.diffviewer.models.filediff.FileDiff) – The FileDiff to retrieve the pre-patch file for.
- request (django.http.HttpRequest, optional) – The HTTP request from the client.
- encoding_list (list of unicode, optional) –
A custom list of encodings to try when processing the file. This will override the encoding list normally retrieved from the FileDiff and repository.
If there’s already a known valid encoding for the file, it will be used instead.
Returns: The pre-patch file.
Raises:
- UnicodeDecodeError – The source file was not compatible with any of the available encodings.
- reviewboard.diffviewer.errors.PatchError – An error occurred when trying to apply the patch.
- reviewboard.scmtools.errors.SCMError – An error occurred while computing the pre-patch file.
-
get_patched_file
(source_data, filediff, request=None)[source]¶ Return the patched version of a file.
This will normalize the patch, applying any changes needed for the repository, and then patch the provided data with the patch contents.
Parameters: - source_data (bytes) – The file contents to patch.
- filediff (reviewboard.diffviewer.models.filediff.FileDiff) – The FileDiff representing the patch.
- request (django.http.HttpRequest, optional) – The HTTP request from the client.
Returns: The patched file contents.
-
get_filenames_match_patterns
(patterns, filenames)[source]¶ Return whether any of the filenames match any of the patterns.
This is used to compare a list of filenames to a list of patterns. The patterns are case-sensitive.
Parameters:
- patterns (list of unicode) – The list of patterns to match against.
- filenames (list of unicode) – The list of filenames.
Returns: True if any filenames match any patterns. False if none match.
Return type: bool
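A case-sensitive "any filename matches any pattern" check can be sketched with the standard library's fnmatch module. Treating the patterns as shell-style wildcards is an assumption here, and any_filename_matches is a hypothetical name.

```python
from fnmatch import fnmatchcase


def any_filename_matches(patterns, filenames):
    """Return True if any filename matches any shell-style pattern.

    fnmatchcase() never normalizes case, so '*.txt' does not match
    'README.TXT' on any platform.
    """
    return any(
        fnmatchcase(filename, pattern)
        for pattern in patterns
        for filename in filenames
    )
```

For instance, any_filename_matches(['*.py'], ['docs/index.rst', 'src/main.py']) is True, since fnmatch's * also matches path separators.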
-
get_filediff_encodings
(filediff, encoding_list=None)[source]¶ Return a list of encodings to try for a FileDiff’s source text.
If the FileDiff already has a known encoding stored, then it will take priority. The provided encoding list, or the repository’s list of configured encodings, will be provided as fallbacks.
Parameters: - filediff (reviewboard.diffviewer.models.filediff.FileDiff) – The FileDiff to return encodings for.
- encoding_list (list of unicode, optional) – An explicit list of encodings to try. If not provided, the repository’s list of encodings will be used instead (which is generally preferred).
Returns: The list of encodings to try for the source file.
Return type: list of unicode
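The priority order described for this function could be modeled as below. This is only a sketch on plain values: build_encoding_list and its parameters are hypothetical, and skipping duplicate entries is an assumption rather than documented behavior.

```python
def build_encoding_list(known_encoding, explicit_list=None,
                        repository_list=()):
    """Order encodings: known encoding first, then the fallbacks.

    known_encoding stands in for an encoding already stored for the file;
    explicit_list stands in for a caller-provided list, which replaces
    repository_list when given.
    """
    fallbacks = explicit_list if explicit_list is not None else repository_list
    result = []

    if known_encoding:
        result.append(known_encoding)

    for encoding in fallbacks:
        # Assumption: avoid trying the same encoding twice.
        if encoding not in result:
            result.append(encoding)

    return result
```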
-
get_matched_interdiff_files
(tool, filediffs, interfilediffs)[source]¶ Generate pairs of matched files for display in interdiffs.
This compares a list of filediffs and a list of interfilediffs, attempting to best match up the files in both for display in the diff viewer.
This will prioritize matches that share a common source filename, destination filename, and new/deleted state. Failing that, matches that share a common source filename are paired off.
Any entries in interfilediffs that don’t have any match in filediffs are considered new changes in the interdiff, and any entries in filediffs that don’t have entries in interfilediffs are considered reverted changes.
Parameters:
- tool (reviewboard.scmtools.core.SCMTool) – The tool used for all these diffs.
- filediffs (list of reviewboard.diffviewer.models.filediff.FileDiff) – The list of filediffs on the left-hand side of the diff range.
- interfilediffs (list of reviewboard.diffviewer.models.filediff.FileDiff) – The list of filediffs on the right-hand side of the diff range.
Yields: tuple – A paired off filediff match. This is a tuple containing two entries, each a FileDiff or None.
-
get_filediffs_match
(filediff1, filediff2)[source]¶ Return whether two FileDiffs effectively match.
This primarily checks that the patched versions of the two files are going to be basically the same.
This will first check that we even have both FileDiffs. Assuming we have both, this will check the diff for equality. If not equal, we at least check that both files were deleted (which is equivalent to being equal).
The patched SHAs are then checked. These would be generated as part of the diff viewing process, so may not be available. We prioritize the SHA256 hashes (introduced in Review Board 4.0), and fall back on SHA1 hashes if not present.
Parameters: - filediff1 (reviewboard.diffviewer.models.filediff.FileDiff) – The first FileDiff to compare.
- filediff2 (reviewboard.diffviewer.models.filediff.FileDiff) – The second FileDiff to compare.
Returns: True if both FileDiffs effectively match. False if they do not.
Return type: bool
Raises: ValueError – None was provided for both filediff1 and filediff2.
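The checks described above can be walked through with a stand-in object. In this sketch, FakeFileDiff and filediffs_effectively_match are hypothetical names, and the attribute names (diff, deleted, patched_sha256, patched_sha1) are assumptions chosen to mirror the description, not a statement of the model's actual fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeFileDiff:
    """Hypothetical stand-in carrying only what the comparison needs."""

    diff: bytes
    deleted: bool = False
    patched_sha256: Optional[str] = None
    patched_sha1: Optional[str] = None


def filediffs_effectively_match(fd1, fd2):
    if fd1 is None and fd2 is None:
        raise ValueError('fd1 and fd2 cannot both be None')

    # A missing side can never match an existing one.
    if fd1 is None or fd2 is None:
        return False

    # Identical diffs, or two deletions, count as a match.
    if fd1.diff == fd2.diff or (fd1.deleted and fd2.deleted):
        return True

    # Prefer SHA256 hashes when both sides have one.
    if fd1.patched_sha256 is not None and fd2.patched_sha256 is not None:
        return fd1.patched_sha256 == fd2.patched_sha256

    # Otherwise fall back on SHA1, if available.
    return (fd1.patched_sha1 is not None and
            fd1.patched_sha1 == fd2.patched_sha1)
```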
-
get_diff_files
(diffset, filediff=None, interdiffset=None, interfilediff=None, base_filediff=None, request=None, filename_patterns=None, base_commit=None, tip_commit=None)[source]¶ Return a list of files that will be displayed in a diff.
This will go through the given diffset/interdiffset, or a given filediff within that diffset, and generate the list of files that will be displayed. This file list will contain a bunch of metadata on the files, such as the index, original/modified names, revisions, associated filediffs/diffsets, and so on.
This can be used along with populate_diff_chunks() to build a full list containing all diff chunks used for rendering a side-by-side diff.
Parameters:
- diffset (reviewboard.diffviewer.models.diffset.DiffSet) – The diffset containing the files to return.
- filediff (reviewboard.diffviewer.models.filediff.FileDiff, optional) – A specific file in the diff to return information for.
- interdiffset (reviewboard.diffviewer.models.diffset.DiffSet, optional) – A second diffset used for an interdiff range.
- interfilediff (reviewboard.diffviewer.models.filediff.FileDiff, optional) –
A second specific file in interdiffset to return information for. This should be provided if filediff and interdiffset are both provided. If it’s None in this case, then the diff will be shown as reverted for this file.
This may not be provided if base_filediff is provided.
- base_filediff (reviewboard.diffviewer.models.filediff.FileDiff, optional) –
The base FileDiff to use.
This may only be provided if filediff is provided and interfilediff is not.
- filename_patterns (list of unicode, optional) – A list of filenames or patterns used to limit the results. Each of these will be matched against the original and modified file of diffs and interdiffs.
- base_commit (reviewboard.diffviewer.models.diffcommit.DiffCommit, optional) –
An optional base commit. No FileDiffs from commits before that commit will be included in the results.
This argument only applies to DiffSets with DiffCommits.
- tip_commit (reviewboard.diffviewer.models.diffcommit.DiffCommit, optional) –
An optional tip commit. No FileDiffs from commits after that commit will be included in the results.
This argument only applies to DiffSets with DiffCommits.
Returns: A list of dictionaries containing information on the files to show in the diff, in the order in which they would be shown.
Return type: list of dict
-
populate_diff_chunks
(files, enable_syntax_highlighting=True, request=None)[source]¶ Populates a list of diff files with chunk data.
This accepts a list of files (generated by get_diff_files) and generates diff chunk data for each file in the list. The chunk data is stored in the file state.
-
get_file_from_filediff
(context, filediff, interfilediff)[source]¶ Return the file that corresponds to the filediff/interfilediff.
This is primarily intended for use with templates. It takes a RequestContext for looking up the user and for caching file lists, in order to improve performance and reduce lookup times for files that have already been fetched.
This function returns either exactly one file or None.
-
get_last_line_number_in_diff
(context, filediff, interfilediff)[source]¶ Determine the last virtual line number in the filediff/interfilediff.
This returns the virtual line number to be used in expandable diff fragments.
-
get_last_header_before_line
(context, filediff, interfilediff, target_line)[source]¶ Get the last header that occurs before the given line.
This returns a dictionary of left header and right header. Each header is either None or a dictionary with the following fields:
Field  Description
line   Virtual line number (union of the original and patched files)
text   The header text
-
get_file_chunks_in_range
(context, filediff, interfilediff, first_line, num_lines)[source]¶ Generate the chunks within a range of lines in the specified filediff.
This is primarily intended for use with templates. It takes a RequestContext for looking up the user and for caching file lists, in order to improve performance and reduce lookup times for files that have already been fetched.
See get_chunks_in_range() for information on the returned state of the chunks.
-
get_chunks_in_range
(chunks, first_line, num_lines)[source]¶ Generate the chunks within a range of lines of a larger list of chunks.
This takes a list of chunks, computes a subset of those chunks from the line ranges provided, and generates a new set of those chunks.
Each returned chunk is a dictionary with the following fields:
Variable  Description
change    The change type (“equal”, “replace”, “insert”, “delete”)
numlines  The number of lines in the chunk.
lines     The list of lines in the chunk.
meta      A dictionary containing metadata on the chunk
Each line in the list of lines is an array with the following data:
Index  Description
0      Virtual line number (union of the original and patched files)
1      Real line number in the original file
2      HTML markup of the original file
3      Changed regions of the original line (for “replace” chunks)
4      Real line number in the patched file
5      HTML markup of the patched file
6      Changed regions of the patched line (for “replace” chunks)
7      True if line consists of only whitespace changes
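To make the chunk and line layout above concrete, the following sketch filters chunks down to a range of virtual line numbers. It is a simplified illustration, not the library's algorithm: chunks_in_virtual_range is a hypothetical name, and selecting lines by comparing index 0 against the requested range is an assumption.

```python
def chunks_in_virtual_range(chunks, first_vline, num_lines):
    """Yield trimmed copies of chunks overlapping the virtual line range.

    Each line is an 8-item sequence as described above; only index 0
    (the virtual line number) is inspected here.
    """
    end_vline = first_vline + num_lines  # Exclusive upper bound.

    for chunk in chunks:
        lines = [
            line for line in chunk['lines']
            if first_vline <= line[0] < end_vline
        ]

        if lines:
            yield {
                'change': chunk['change'],
                'numlines': len(lines),
                'lines': lines,
                'meta': chunk['meta'],
            }


# Sample data: [vline, orig_num, orig_html, orig_regions,
#               new_num, new_html, new_regions, whitespace_only]
sample_chunks = [
    {'change': 'equal', 'numlines': 2, 'meta': {}, 'lines': [
        [1, 1, 'a', None, 1, 'a', None, False],
        [2, 2, 'b', None, 2, 'b', None, False],
    ]},
    {'change': 'replace', 'numlines': 1, 'meta': {}, 'lines': [
        [3, 3, 'c', [(0, 1)], 3, 'C', [(0, 1)], False],
    ]},
    {'change': 'insert', 'numlines': 2, 'meta': {}, 'lines': [
        [4, '', '', None, 4, 'd', None, False],
        [5, '', '', None, 5, 'e', None, False],
    ]},
]
```

Requesting three lines starting at virtual line 2 yields one line from each of the three sample chunks.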
-
get_line_changed_regions
(oldline, newline)[source]¶ Returns regions of changes between two similar lines.
-
get_sorted_filediffs
(filediffs, key=None)[source]¶ Sorts a list of filediffs.
The list of filediffs will be sorted first by their base paths in ascending order.
Within a base path, they’ll be sorted by base name (minus the extension) in ascending order.
If two files have the same base path and base name, we’ll sort by the extension in descending order. This will make *.h sort ahead of *.c/*.cpp, for example.
If the list being passed in is actually not a list of FileDiffs, it must provide a callable key parameter that will return a FileDiff for the given entry in the list. This will only be called once per item.
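The three-level ordering described above (base path ascending, base name ascending, extension descending) is sketched here on plain path strings rather than FileDiffs. The helper names are hypothetical, and using posixpath to split paths is an assumption.

```python
import posixpath
from functools import cmp_to_key


def _compare_paths(path_a, path_b):
    dir_a, file_a = posixpath.split(path_a)
    dir_b, file_b = posixpath.split(path_b)

    # 1. Base path, ascending.
    if dir_a != dir_b:
        return -1 if dir_a < dir_b else 1

    name_a, ext_a = posixpath.splitext(file_a)
    name_b, ext_b = posixpath.splitext(file_b)

    # 2. Base name without the extension, ascending.
    if name_a != name_b:
        return -1 if name_a < name_b else 1

    # 3. Extension, descending (so '.h' sorts ahead of '.c' and '.cpp').
    if ext_a != ext_b:
        return -1 if ext_a > ext_b else 1

    return 0


def sort_paths_like_filediffs(paths):
    return sorted(paths, key=cmp_to_key(_compare_paths))
```

A comparison function is used instead of a tuple key because the last level sorts in the opposite direction from the first two.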
-
get_displayed_diff_line_ranges
(chunks, first_vlinenum, last_vlinenum)[source]¶ Return the displayed line ranges based on virtual line numbers.
This takes the virtual line numbers (the index in the side-by-side diff lines) and returns the human-readable line numbers, the chunks they’re in, and mapped virtual line numbers.
A virtual line range may start or end in a chunk not containing displayed line numbers (such as an “original” range starting/ending in an “insert” chunk). The resulting displayed line ranges will exclude these chunks.
Returns: A tuple of displayed line range information, containing 2 items.
Each item will either be a dictionary of information, or None if there aren’t any displayed lines to show.
The dictionary contains the following keys:
- display_range: A tuple containing the displayed line range.
- virtual_range: A tuple containing the virtual line range that display_range maps to.
- chunk_range: A tuple containing the beginning/ending chunks that display_range maps to.
Return type: tuple
Raises: ValueError – The range provided was invalid.
-
get_diff_data_chunks_info
(diff)[source]¶ Return information on each chunk in a diff.
This will scan through a unified diff file, looking for each chunk in the diff and returning information on their ranges and lines of context. This can be used to generate statistics on diffs and help map changed regions in diffs to lines of source files.
New in version 3.0.18.
Parameters: diff (bytes) – The diff data to scan.
Returns: A list of chunk information dictionaries. Each entry has an orig and modified dictionary containing the following keys:
- chunk_start (int): The starting line number of the chunk shown in the diff, including any lines of context. This is 0-based.
- chunk_len (int): The length of the chunk shown in the diff, including any lines of context.
- changes_start (int): The starting line number of a range of changes shown in a chunk in the diff. This is after any lines of context and is 0-based.
- changes_len (int): The length of the changes shown in a chunk in the diff, excluding any lines of context.
- pre_lines_of_context (int): The number of lines of context before any changes in a chunk. If the chunk doesn’t have any changes, this will contain all lines of context otherwise shown around changes in the other region in this entry.
- post_lines_of_context (int): The number of lines of context after any changes in a chunk. If the chunk doesn’t have any changes, this will be 0.
Return type: list of dict
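As a partial illustration of the scan, the sketch below reads only the unified diff chunk headers and derives the chunk_start and chunk_len values. It does not compute the changes_* or *_lines_of_context keys. The regex and the names scan_chunk_headers and _make_range are assumptions for this example and are not CHUNK_RANGE_RE or the library's code.

```python
import re

# Hypothetical pattern for headers such as b'@@ -10,7 +10,8 @@ def foo():'
_HEADER_RE = re.compile(br'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')


def _make_range(start, length):
    # In unified diffs, an omitted length means a length of 1.
    length = 1 if length is None else int(length)

    # Header line numbers are 1-based; convert to 0-based. A start of 0
    # (an empty range) is clamped to 0, which is an assumption here.
    return {
        'chunk_start': max(int(start) - 1, 0),
        'chunk_len': length,
    }


def scan_chunk_headers(diff):
    """Return orig/modified range info for each chunk header in diff."""
    results = []

    for line in diff.splitlines():
        m = _HEADER_RE.match(line)

        if m:
            orig_start, orig_len, mod_start, mod_len = m.groups()
            results.append({
                'orig': _make_range(orig_start, orig_len),
                'modified': _make_range(mod_start, mod_len),
            })

    return results
```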
-
check_diff_size
(diff_file, parent_diff_file=None)[source]¶ Check the size of the given diffs against the maximum allowed size.
If either of the provided diffs is too large, an exception will be raised.
Parameters: - diff_file (django.core.files.uploadedfile.UploadedFile) – The diff file.
- parent_diff_file (django.core.files.uploadedfile.UploadedFile, optional) – The parent diff file, if any.
Raises: reviewboard.diffviewer.errors.DiffTooBigError
– The supplied files are too big.
-
get_total_line_counts
(files_qs)[source]¶ Return the total line counts of all given FileDiffs.
Parameters: files_qs (django.db.models.query.QuerySet) – The queryset describing the FileDiffs.
Returns: A dictionary with the following keys:
- raw_insert_count
- raw_delete_count
- insert_count
- delete_count
- replace_count
- equal_count
- total_line_count
Each entry maps to the sum of that line count type for all FileDiffs.
Return type: dict
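The summing behavior can be sketched with plain dictionaries standing in for per-file line counts. The name total_line_counts is hypothetical, and treating a missing or None count as 0 is an assumption.

```python
COUNT_KEYS = (
    'raw_insert_count',
    'raw_delete_count',
    'insert_count',
    'delete_count',
    'replace_count',
    'equal_count',
    'total_line_count',
)


def total_line_counts(per_file_counts):
    """Sum each count type across an iterable of per-file count dicts."""
    totals = dict.fromkeys(COUNT_KEYS, 0)

    for counts in per_file_counts:
        for key in COUNT_KEYS:
            # Assumption: counts not yet computed (missing or None) add 0.
            totals[key] += counts.get(key) or 0

    return totals
```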