Commit 043ed2d
fix(pptx): handle NotImplementedError from shape.shape_type (#3309)
* fix(pptx): handle NotImplementedError from shape.shape_type
python-pptx raises NotImplementedError from Shape.shape_type for
<p:sp> elements that aren't placeholders, autoshapes, textboxes, or
freeforms (e.g. shapes with empty <p:spPr> from Google Slides exports,
LibreOffice, or Keynote). handle_groups() and handle_shapes() access
shape_type without catching this, crashing the entire conversion.

Add a _safe_shape_type() helper that returns None on
NotImplementedError, so unrecognized shapes skip only the GROUP
recursion and PICTURE extraction while text and table extraction
proceed normally.

Fixes #3308
Signed-off-by: Tejas Patel <tejas226@hotmail.com>
* Fix lint
Signed-off-by: Tejas Patel <tejas226@hotmail.com>
---------
Signed-off-by: Tejas Patel <tejas226@hotmail.com>
1 parent 8ec14f2 commit 043ed2d
File tree
6 files changed, +575 -2 lines changed
- docling/backend
- tests
- data
- groundtruth/docling_v2
- pptx
First changed file (diff text not preserved; only line numbers survive):
- Hunk 1, original lines 688-699, new lines 688-712: 13 lines added at new lines 691-703; original line 696 replaced by new line 709.
- Hunk 2, original lines 716-722, new lines 729-735: original line 719 replaced by new line 732.
Lines changed: 18 additions & 0 deletions